Found 19 repositories (showing 19)
OSU-NLP-Group
[ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large multimodal models (LMMs) such as GPT-4V(ision).
OSU-NLP-Group
No description available
zytelaine
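AndroidWorld is a dynamic evaluation environment and benchmark suite for general-purpose mobile agents, running on a real Android emulator. The project ships 116 hand-crafted tasks covering 20 commonly used apps; task parameters can be generated dynamically, yielding millions of variants, and the suite provides stable, reproducible reward signals for reliable evaluation. Beyond native Android tasks, it also integrates the MiniWoB++ web benchmark (recreating common input scenarios with native Android widgets) and supports multiple agents (e.g. M3A, T3A, SeeAct, random, human) and large-model backends (GPT, Gemini, etc.). It also offers a lightweight resource footprint, extensible tasks, and verifiable evaluation (based on SQLite/file-system validators). Use run.py to run the full benchmark or a subset of tasks, and minimal_task_runner.py to quickly try the single-task flow.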
DehaiZhao
SeeAction: Towards Reverse Engineering How-What-Where of HCI Actions from Screencasts for UI Automation
jaaberg
SEE a problem, ACT upon it. An ESLint plugin.
chandranerella
SeeAct Webagent
cemac-tech
Repository for the SEE Activities Omniglobe feature
NikitaMalik2303
No description available
VuAnh59ht
No description available
WeijianQ
No description available
quqib
Modifies the source code of the open-source seeact project to fix page element localization and UI control recognition issues
snehasquasher
A review of the SeeAct Paper
Syclus123
No description available
BareBeaverBat
Chrome extension that lets end users leverage the logic/behavior of the Python code in the SeeAct repository (i.e. just installing a Chrome extension from the Chrome Web Store rather than having to install Python and Playwright locally and then download/run SeeAct).
AIM-Intelligence
No description available
MaestroBaz
SeeAction: Towards Reverse Engineering How-What-Where of HCI Actions from Screencasts for UI Automation
leonOKS23
No description available
gordonfuliman
[ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult…
Carasuokala
[ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult…