Finding Waldo with SAM (Segment Anything Model)

BOTAI / 2023-05-03 / Original post

Did you know you can teach #GPT3 to find Waldo? gradio-tools version 0.0.7 has been released, with support for @MetaAI's #segmentanything model (SAM).

Ask #GPT3 to find a man wearing red and white stripes, and Waldo shows up!

pip install gradio-tools

Here is the @LangChainAI code! My description of Waldo was "man with glasses wearing sweater with red and white stripes". Maybe you can reduce false positives by improving the description!

#!/usr/bin/python3
# -*- coding: utf-8 -*-
# Ask a LangChain agent (OpenAI LLM plus the SAM tool from gradio-tools) to find Waldo
import os
import pathlib

from gradio_tools.tools import SAMImageSegmentationTool
from langchain.agents import initialize_agent
from langchain.llms import OpenAI
from langchain.memory import ConversationBufferMemory

# Placeholder key: replace it with your own OpenAI API key
os.environ["OPENAI_API_KEY"] = 'sk-zxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

llm = OpenAI(temperature=0)
# Conversation memory lets the second request build on the first one
memory = ConversationBufferMemory(memory_key="chat_history")
# Wrap the Gradio-hosted SAM demo as a LangChain tool
tools = [SAMImageSegmentationTool().langchain]

# Both input images are expected next to this script
waldo_1 = pathlib.Path(__file__).parent / "waldo1.jpg"
waldo_2 = pathlib.Path(__file__).parent / "waldo_result.jpg"

agent = initialize_agent(tools, llm, memory=memory, agent="conversational-react-description", verbose=True)

# First request: describe Waldo in detail
output = agent.run(input=(f"please find waldo in this image: {waldo_1} .waldo is a man with distinct wardrobe of a red and white striped shirt, blue jeans, brown boots, red and white striped socks, glasses"))
print(output)

# Second request: rely on the conversation memory for the description
output = agent.run(input=(f"Great job! Now find Waldo in this image: {waldo_2}"))
print(output)
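
Since the wording of the description is what drives false positives, it can help to try several descriptions against the same image. Here is a minimal sketch of a prompt builder for that. The helper `build_waldo_prompt` and the trait lists are my own illustration and not part of gradio-tools or LangChain; each resulting string could be passed to `agent.run(input=...)` from the script above.

```python
def build_waldo_prompt(image_path, traits):
    """Build the agent input from an image path and a list of visual traits.

    Hypothetical helper for experimenting with descriptions; it only
    assembles a string and does not call any model.
    """
    # Drop empty entries and join the remaining traits into one description
    description = ", ".join(t.strip() for t in traits if t.strip())
    return (
        f"please find waldo in this image: {image_path} . "
        f"waldo is a man with {description}"
    )


# Two candidate descriptions, from terse to detailed (illustrative values)
candidate_descriptions = [
    ["glasses", "a sweater with red and white stripes"],
    ["a red and white striped shirt", "blue jeans", "brown boots", "glasses"],
]

prompts = [build_waldo_prompt("waldo1.jpg", traits) for traits in candidate_descriptions]
for prompt in prompts:
    print(prompt)
```

Running each prompt separately and comparing the output images is a simple way to see which traits actually help the segmentation.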

The original image:

Run log

Loaded as API: https://curt-park-segment-anything-with-clip.hf.space ✔
C:\Users\Botai\Desktop\waldo2\waldo1.jpg
C:\Users\Botai\Desktop\waldo2\waldo_result.jpg


> Entering new AgentExecutor chain...

Thought: Do I need to use a tool? Yes
Action: SAMImageSegmentation
Action Input: C:\Users\Botai\Desktop\waldo2\waldo1.jpg|waldo is a man with glasses wearing sweater with red and white stripes|0.9|0.8|0.85
Job Status: Status.STARTING eta: None

Observation: C:\Users\Botai\AppData\Local\Temp\tmpq0294t4c.png
Thought: Do I need to use a tool? No
AI: I have identified Waldo in the image. Here is the output image: C:\Users\Botai\AppData\Local\Temp\tmpq0294t4c.png

> Finished chain.
I have identified Waldo in the image. Here is the output image: C:\Users\Botai\AppData\Local\Temp\tmpq0294t4c.png


> Entering new AgentExecutor chain...

Thought: Do I need to use a tool? Yes
Action: SAMImageSegmentation
Action Input: C:\Users\Botai\Desktop\waldo2\waldo_result.jpg|waldo|0.9|0.8|0.85
Job Status: Status.STARTING eta: None
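
The "Action Input" lines in the log are a single string with five fields separated by `|`: the image path, the text query, and three numeric thresholds. The sketch below splits such a string so the fields are easier to inspect. The field names are my assumption, based on my reading of the tool's description (predicted IoU threshold, stability score threshold, box threshold); the log itself only shows the values 0.9, 0.8 and 0.85. `SamToolInput` and `parse_action_input` are hypothetical names, not part of gradio-tools.

```python
from typing import NamedTuple


class SamToolInput(NamedTuple):
    """The five pipe-separated fields seen in the log (names are assumed)."""
    image: str
    query: str
    predicted_iou_threshold: float
    stability_score_threshold: float
    box_threshold: float


def parse_action_input(raw: str) -> SamToolInput:
    """Split an 'Action Input' string into typed fields."""
    parts = raw.split("|")
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields separated by '|', got {len(parts)}")
    image, query, iou, stability, box = (p.strip() for p in parts)
    return SamToolInput(image, query, float(iou), float(stability), float(box))


# The first Action Input from the log above, as a raw string
raw = (
    r"C:\Users\Botai\Desktop\waldo2\waldo1.jpg"
    "|waldo is a man with glasses wearing sweater with red and white stripes"
    "|0.9|0.8|0.85"
)
parsed = parse_action_input(raw)
print(parsed.query)
print(parsed.predicted_iou_threshold, parsed.stability_score_threshold, parsed.box_threshold)
```

Note that in the second run the agent passed only "waldo" as the query, so the detailed description from the first request did not carry over into the tool call.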

The generated image: