- #659
f9a435e
Thanks @miguelg719! - Added native support for Google Generative models (Gemini)
-
#647
ca5467d
Thanks @seanmcguire12! - collapse redundant text nodes into parent elements -
#636
9037430
Thanks @seanmcguire12! - fix token act metrics and inference logging being misplaced as observe metrics and inference logging -
#648
169e7ea
Thanks @seanmcguire12! - add mapping of node id -> url -
#654
57a9853
Thanks @seanmcguire12! - fix repeated up & down scrolling bug for clicks inside act
-
#624
cf167a4
Thanks @seanmcguire12! - export stagehand error classes so they can be referenced from @dist -
#640
178f5f0
Thanks @yash1744! - Added support for stagehand agents to automatically redirect to https://google.com when the page URL is empty or set to about:blank, preventing empty screenshots and saving tokens. -
#633
86724f6
Thanks @miguelg719! - Fix the getBrowser logic for redundant api calls and throw informed errors -
#656
c630373
Thanks @seanmcguire12! - parse out % signs from variables in act -
#637
944bbbf
Thanks @kamath! - Fix: forward along the stack trace in StagehandDefaultError
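The agent redirect described in #640 above can be sketched roughly as follows. This is a minimal illustration and not Stagehand's actual implementation: the function name `resolveAgentStartUrl` and the constant `FALLBACK_URL` are hypothetical, and only the fallback target (https://google.com) and the two trigger conditions (empty URL, about:blank) come from the changelog entry.

```typescript
// Hypothetical constant; the changelog entry names https://google.com as the target.
const FALLBACK_URL = "https://google.com";

// Returns the URL an agent should start from. An empty URL or about:blank
// would produce an empty screenshot, so both are replaced with the fallback.
function resolveAgentStartUrl(currentUrl: string | null | undefined): string {
  const url = (currentUrl ?? "").trim();
  if (url === "" || url === "about:blank") {
    return FALLBACK_URL;
  }
  return url;
}
```

Any other URL is passed through unchanged, so pages the caller has already navigated to are left alone.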
-
#591
e234a0f
Thanks @miguelg719! - Announcing Stagehand 2.0! 🎉 We're thrilled to announce the release of Stagehand 2.0, bringing significant improvements to make browser automation more powerful, faster, and easier to use than ever before.
- Introducing stagehand.agent: A powerful new way to integrate SOTA computer-use models or Browserbase's Open Operator into Stagehand with one line of code! Perfect for multi-step workflows and complex interactions.
- Lightning-fast act and extract: Major performance improvements to make your automations run significantly faster.
- Enhanced Logging: Better visibility into what's happening during automation with improved logging and debugging capabilities.
- Comprehensive Documentation: A completely revamped documentation site with better examples, guides, and best practices.
- Improved Error Handling: More descriptive errors and better error recovery to help you debug issues faster.
- Better TypeScript Support: Enhanced type definitions and better IDE integration
- Better Error Messages: Clearer, more actionable error messages to help you debug faster
- Improved Caching: More reliable action caching for better performance
We're excited to see what you build with Stagehand 2.0! For questions or support, join our Slack community.
For more details, check out our documentation.
-
#588
ba9efc5
Thanks @sameelarif! - Added support for offloading agent tasks to the API. -
#600
11e015d
Thanks @sameelarif! - Added a stagehand.history array which stores an array of act, extract, observe, and goto calls made. Since this history array is stored on the StagehandPage level, it will capture methods even if indirectly called by an agent. -
#601
1d22604
Thanks @seanmcguire12! - add custom error classes -
#599
75d8fb3
Thanks @miguelg719! - cleaner logging with pino -
#609
c92295d
Thanks @kamath! - Removed deprecated fields and methods from Stagehand constructor and added cdpUrl to localBrowserLaunchOptions for custom CDP URLs support. -
#571
73d6736
Thanks @miguelg719! - You can now use Computer Using Agents (CUA) natively in Stagehand for both Anthropic and OpenAI models! This unlocks a brand new frontier of applications for Stagehand users 🤘 -
#619
7b0b996
Thanks @sameelarif! - add disablePino flag to stagehand constructor params -
#620
566e587
Thanks @kamath! - You can now pass in an OpenAI instance as an llmClient to the Stagehand constructor! This allows you to use Stagehand with any OpenAI-compatible model, like Ollama, Gemini, etc., as well as OpenAI wrappers like Braintrust. -
#586
c57dc19
Thanks @sameelarif! - Added native Stagehand agentic loop functionality. This allows you to build agentic workflows with a single prompt without using a computer-use model. To try it out, create a stagehand.agent without passing in a provider.
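The history array from #600 above can be pictured with a small page-level recorder. This is a sketch under stated assumptions: the class `PageHistory`, the `HistoryEntry` shape, and its field names are hypothetical and are not taken from Stagehand's source. Only the four recorded method names and the idea of recording at the page level come from the changelog entry.

```typescript
// The four call types the changelog says are recorded.
type HistoryMethod = "act" | "extract" | "observe" | "goto";

// Hypothetical entry shape, chosen for illustration only.
interface HistoryEntry {
  method: HistoryMethod;
  parameters: unknown;
  timestamp: string; // ISO 8601 time at which the call was recorded
}

class PageHistory {
  private readonly entries: HistoryEntry[] = [];

  // Called by the page wrapper on every act/extract/observe/goto, so calls
  // issued indirectly (for example by an agent) are recorded as well.
  record(method: HistoryMethod, parameters: unknown): void {
    this.entries.push({ method, parameters, timestamp: new Date().toISOString() });
  }

  // Returns a copy so callers cannot mutate the recorded log.
  get history(): ReadonlyArray<HistoryEntry> {
    return this.entries.slice();
  }
}
```

Because recording happens where the page methods are implemented rather than where they are called, the log does not depend on who initiated the call.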
-
#580
179e17c
Thanks @seanmcguire12! - refactor _performPlaywrightMethod -
#608
71ee10d
Thanks @seanmcguire12! - added support for "scrolling to next/previous chunk" -
#594
e483484
Thanks @seanmcguire12! - pass observeHandler into actHandler -
#569
17e8b40
Thanks @seanmcguire12! - you can now call stagehand.metrics to get token usage metrics. you can also set logInferenceToFile in stagehand config to log the entire call/response history from stagehand & the LLM. -
#617
affa564
Thanks @seanmcguire12! - use a11y tree for default extract -
#589
0c4b1e7
Thanks @miguelg719! - Added CDP support for screenshots, find more about the benefits here: https://docs.browserbase.com/features/screenshots#why-use-cdp-for-screenshots%3F -
#584
c7c1a80
Thanks @miguelg719! - Fix to remove unnecessary healthcheck ping on sdk -
#616
2a27e1c
Thanks @miguelg719! - Fixed new opened tab handling for CUA models -
#582
dfd24e6
Thanks @seanmcguire12! - support api usage for extract with no args -
#563
98166d7
Thanks @seanmcguire12! - support scrolling in act
-
#598
53889d4
Thanks @miguelg719! - Fix the open operator handler to work with anthropic -
#605
b8beaec
Thanks @sameelarif! - Added support for resuming a Stagehand session created on the API. -
#612
cd36068
Thanks @seanmcguire12! - remove all logic related to dom based act -
#577
4fdbf63
Thanks @seanmcguire12! - remove debugDom -
#603
2a14a60
Thanks @seanmcguire12! - rm unused handlePossiblePageNavigation -
#614
a59eaef
Thanks @kamath! - override whatwg-url to avoid punycode warning -
#573
c24f3c9
Thanks @seanmcguire12! - return act result in actFromObserve
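The token metrics introduced in #569 above, and the attribution fix in #636, suggest usage being bucketed per calling method. The following is a sketch of that idea only: `UsageTracker`, `FunctionUsage`, and all field names are hypothetical and do not describe the real shape of stagehand.metrics.

```typescript
type StagehandFunction = "act" | "extract" | "observe";

// Hypothetical per-method counters.
interface FunctionUsage {
  promptTokens: number;
  completionTokens: number;
  inferenceTimeMs: number;
}

function emptyUsage(): FunctionUsage {
  return { promptTokens: 0, completionTokens: 0, inferenceTimeMs: 0 };
}

class UsageTracker {
  private readonly usage: Record<StagehandFunction, FunctionUsage> = {
    act: emptyUsage(),
    extract: emptyUsage(),
    observe: emptyUsage(),
  };

  // `origin` is the method the user called, not the internal helper that
  // issued the LLM request, so act usage is never booked under observe.
  add(origin: StagehandFunction, delta: FunctionUsage): void {
    const bucket = this.usage[origin];
    bucket.promptTokens += delta.promptTokens;
    bucket.completionTokens += delta.completionTokens;
    bucket.inferenceTimeMs += delta.inferenceTimeMs;
  }

  // Returns a copy of one method's counters.
  get(origin: StagehandFunction): FunctionUsage {
    const b = this.usage[origin];
    return {
      promptTokens: b.promptTokens,
      completionTokens: b.completionTokens,
      inferenceTimeMs: b.inferenceTimeMs,
    };
  }

  // Sums the counters across all three methods.
  total(): FunctionUsage {
    const sum = emptyUsage();
    const keys: StagehandFunction[] = ["act", "extract", "observe"];
    for (let i = 0; i < keys.length; i++) {
      const b = this.usage[keys[i]];
      sum.promptTokens += b.promptTokens;
      sum.completionTokens += b.completionTokens;
      sum.inferenceTimeMs += b.inferenceTimeMs;
    }
    return sum;
  }
}
```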
-
#518
516725f
Thanks @sameelarif! - act() can now use observe() under the hood, resulting in significant performance improvements. To opt in to this change, set slowDomBasedAct: false in ActOptions. -
#483
8c9445f
Thanks @seanmcguire12! - When using textExtract, you can now do targeted extraction by passing an xpath string into extract via the selector parameter. This limits the DOM processing step to a target element, reducing tokens and increasing speed. For example:

    const weatherData = await stagehand.page.extract({
      instruction: "extract the weather data for Sun, Feb 23 at 11PM",
      schema: z.object({
        temperature: z.string(),
        weather_description: z.string(),
        wind: z.string(),
        humidity: z.string(),
        barometer: z.string(),
        visibility: z.string(),
      }),
      modelName,
      useTextExtract,
      selector: xpath, // xpath of the element to extract from
    });
-
#556
499a72d
Thanks @kamath! - You can now set a timeout for dom-based stagehand act! Do this in act with timeoutMs as a parameter, or set a global param to actTimeoutMs in Stagehand config. -
#544
55c9673
Thanks @seanmcguire12! - you can now deterministically get the full text representation of a webpage by calling extract() (with no arguments) -
#538
d898d5b
Thanks @sameelarif! - Added gpt-4.5-preview and claude-3-7-sonnet-latest as supported models. -
#523
44cf7cc
Thanks @kwt00! You can now natively run Cerebras LLMs! cerebras-llama-3.3-70b and cerebras-llama-3.1-8b are now supported models as long as CEREBRAS_API_KEY is set in your environment. -
#542
cf7fe66
Thanks @sankalpgunturi! You can now natively run Groq LLMs! groq-llama-3.3-70b-versatile and groq-llama-3.3-70b-specdec are now supported models as long as GROQ_API_KEY is set in your environment.
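The Cerebras and Groq entries above (#523, #542) tie each model family to an environment variable. A minimal pre-flight check along those lines might look like the sketch below. The helper `missingApiKey` and the prefix table are hypothetical. Only the model-name prefixes and the two variable names come from the changelog, and the environment is passed in as a plain object so the snippet does not depend on any runtime.

```typescript
// Model-name prefix -> environment variable, as named in the changelog entries.
const PROVIDER_ENV_VARS: Array<{ prefix: string; envVar: string }> = [
  { prefix: "cerebras-", envVar: "CEREBRAS_API_KEY" },
  { prefix: "groq-", envVar: "GROQ_API_KEY" },
];

// Returns the name of the environment variable that is required but unset,
// or undefined when the key is present or the model is not a Cerebras/Groq one.
function missingApiKey(
  modelName: string,
  env: Record<string, string | undefined>,
): string | undefined {
  for (let i = 0; i < PROVIDER_ENV_VARS.length; i++) {
    const entry = PROVIDER_ENV_VARS[i];
    const matches = modelName.slice(0, entry.prefix.length) === entry.prefix;
    if (matches) {
      return env[entry.envVar] ? undefined : entry.envVar;
    }
  }
  return undefined;
}
```

In a Node.js script, `process.env` could be passed as the second argument.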
-
#506
e521645
Thanks @miguelg719! - fixing 5s timeout on actHandler -
#535
3782054
Thanks @miguelg719! - Adding backwards compatibility to new act->observe pipeline by accepting actOptions -
#508
270f666
Thanks @miguelg719! - Fixed stagehand to support multiple pages with an enhanced context -
#559
18533ad
Thanks @seanmcguire12! - fix: continuously adjusting chunk size inside act
-
#554
5f1868b
Thanks @seanmcguire12! - fix targeted extract issue with scrollintoview and not chunking correctly -
#555
fc5e8b6
Thanks @seanmcguire12! - fix issue where processAllOfDom doesn't scroll to end of page when there is dynamic content -
#552
a25a4cb
Thanks @seanmcguire12! - accept xpaths with 'xpath=' prepended to the front in addition to xpaths without -
#534
f0c162a
Thanks @seanmcguire12! - call this.end() if the process exists -
#528
c820bfc
Thanks @seanmcguire12! - handle attempt to close session that has already been closed when using the api -
#520
f49eebd
Thanks @miguelg719! - Performing act from a 'not-supported' ObserveResult will now throw an informed error
- #509
a7d345e
Thanks @miguelg719! - Bun runs will now throw a more informed error
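Entry #552 above accepts xpaths both with and without a leading 'xpath='. One way to handle that is to normalize the selector once at the boundary, as in this sketch. The helper name `normalizeXPath` is hypothetical and is not taken from Stagehand's source.

```typescript
const XPATH_PREFIX = "xpath=";

// Strips an optional leading "xpath=" so downstream code only ever sees the
// bare expression. Selectors without the prefix are returned unchanged
// (apart from surrounding whitespace being trimmed).
function normalizeXPath(selector: string): string {
  const trimmed = selector.trim();
  if (trimmed.slice(0, XPATH_PREFIX.length) === XPATH_PREFIX) {
    return trimmed.slice(XPATH_PREFIX.length);
  }
  return trimmed;
}
```

Only a prefix at the very start is removed, so an expression that happens to contain the text "xpath=" later on (for example inside a string literal) is left intact.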
-
#486
33f2b3f
Thanks @sameelarif! - [Unreleased] Parameterized offloading Stagehand method calls to the Stagehand API. In the future, this will allow for better observability and debugging experience. -
#494
9ba4b0b
Thanks @pkiv! - Added LocalBrowserLaunchOptions to provide comprehensive configuration options for local browser instances. Deprecated the top-level headless option in favor of using localBrowserLaunchOptions.headless -
#500
a683fab
Thanks @miguelg719! - Including Iframes in ObserveResults. This appends any iframe(s) found in the page to the end of observe results on any observe call. -
#504
577662e
Thanks @sameelarif! - Enabled support for Browserbase captcha solving after page navigations. This can be enabled with the new constructor parameter: waitForCaptchaSolves. -
#496
28ca9fb
Thanks @sameelarif! - Fixed browserbaseSessionCreateParams not being passed in to the API initialization payload.
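Entry #494 above deprecates the top-level headless option in favor of localBrowserLaunchOptions.headless. A deprecation like this is commonly handled by preferring the new field and falling back to the old one with a warning. The sketch below shows that pattern only. The precedence order, the default of false, the warning text, and the names `resolveHeadless` and `ConstructorParamsSketch` are all assumptions, not Stagehand's documented behavior.

```typescript
interface LocalBrowserLaunchOptionsSketch {
  headless?: boolean;
}

// Hypothetical subset of the constructor parameters, for illustration.
interface ConstructorParamsSketch {
  headless?: boolean; // deprecated top-level option
  localBrowserLaunchOptions?: LocalBrowserLaunchOptionsSketch;
}

function resolveHeadless(
  params: ConstructorParamsSketch,
  warn: (message: string) => void,
): boolean {
  // Assumption: the nested option wins when both are provided.
  const nested = params.localBrowserLaunchOptions?.headless;
  if (nested !== undefined) {
    return nested;
  }
  if (params.headless !== undefined) {
    warn("headless is deprecated; use localBrowserLaunchOptions.headless instead");
    return params.headless;
  }
  // Assumption: headed mode when nothing is specified.
  return false;
}
```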
-
#459
62a29ee
Thanks @seanmcguire12! - create a11y + dom hybrid input for observe -
#463
e40bf6f
Thanks @seanmcguire12! - include 'Scrollable' annotations in a11y-dom hybrid -
#480
4c07c44
Thanks @miguelg719! - Adding a fallback try on actFromObserveResult to use the description from observe and call regular act. -
#487
2c855cf
Thanks @seanmcguire12! - update refine extraction prompt to ensure correct schema is used
-
#426
bbbcee7
Thanks @miguelg719! - Observe got a major upgrade. Now it will return a suggested playwright method with any necessary arguments for the generated candidate elements. It also includes a major speedup when using a11y tree processing for context. -
#452
16837ec
Thanks @kamath! - add o3-mini to availablemodel -
#441
1032d7d
Thanks @seanmcguire12! - allow act to accept observe output
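Taken together, #426 and #441 above mean an observe result can carry a suggested Playwright method plus arguments, and act can consume that result directly. The sketch below shows one way such a hand-off could be validated before anything is executed. The `ObserveResultSketch` fields and the helper `planActionFromObserve` are assumptions made for illustration and may not match Stagehand's actual types.

```typescript
// Hypothetical shape of an observe result carrying a suggested action.
interface ObserveResultSketch {
  selector: string;
  description: string;
  method?: string;      // suggested Playwright method, e.g. "click" or "fill"
  arguments?: string[]; // arguments for that method, if any
}

interface PlannedAction {
  selector: string;
  method: string;
  args: string[];
}

// Turns an observe result into a concrete action plan, failing loudly when
// the result carries no method to execute.
function planActionFromObserve(result: ObserveResultSketch): PlannedAction {
  if (!result.method) {
    throw new Error(
      'Cannot act on "' + result.description + '": no Playwright method was suggested.',
    );
  }
  return {
    selector: result.selector,
    method: result.method,
    args: result.arguments ?? [],
  };
}
```

Validating up front keeps the failure close to its cause, instead of surfacing later as an opaque Playwright error.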
-
#458
da2e5d1
Thanks @miguelg719! - Updated getAccessibilityTree() to make sure it doesn't skip useful nodes. Improved getXPathByResolvedObjectId() to account for text nodes and not skip generation -
#448
b216072
Thanks @seanmcguire12! - improve handling of radio button clicks -
#445
5bc514f
Thanks @miguelg719! - Adding back useAccessibilityTree param to observe with a deprecation warning/error indicating to use onlyVisible instead
- #428
5efeb5a
Thanks @seanmcguire12! - temporarily remove vision
- #422
a2878d0
Thanks @miguelg719! - Fixing a build type error for async functions being called inside evaluate for observeHandler.
-
#412
4aa4813
Thanks @miguelg719! - Includes a new format to get website context using accessibility (a11y) trees. The new context is provided optionally with the flag useAccessibilityTree for observe tasks. -
#417
1f2b2c5
Thanks @sameelarif! - Simplify Stagehand method calls by allowing a simple string input instead of an options object. -
#405
0df1e23
Thanks @seanmcguire12! - in ProcessAllOfDom, scroll on large scrollable elements instead of just the root DOM -
#373
ff00965
Thanks @sameelarif! - Allow the input of custom instructions into the constructor so that users can guide, or provide guardrails to, the LLM in making decisions.
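The string shorthand from #417 above is typically implemented by normalizing the argument at the top of the method. The sketch below assumes a hypothetical `ActOptionsSketch` type with an `action` field and a hypothetical helper `normalizeActInput`. It illustrates the pattern and does not reproduce Stagehand's code.

```typescript
// Hypothetical options object; only `action` matters for this sketch.
interface ActOptionsSketch {
  action: string;
  variables?: Record<string, string>;
}

// Accepts either a bare instruction string or a full options object and
// always returns the options form, so the rest of the method has one code path.
function normalizeActInput(input: string | ActOptionsSketch): ActOptionsSketch {
  if (typeof input === "string") {
    return { action: input };
  }
  return input;
}
```

With this in place, a call such as `act("click the login button")` and a call passing `{ action: "click the login button" }` would be treated identically.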
-
#362
9c20de3
Thanks @seanmcguire12! - reduce collisions and improve accuracy of textExtract -
#413
737b4b2
Thanks @seanmcguire12! - remove topMostElement check when verifying visibility of text nodes
-
#374
207244e
Thanks @sameelarif! - Pass in a Stagehand Page object into the on("popup") listener to allow for multi-page handling. -
#367
75c0e20
Thanks @kamath! - Logger in LLMClient is inherited by default from Stagehand. Named rather than positional arguments are used in implemented LLMClients. -
#385
5899ec2
Thanks @sameelarif! - Moved the LLMClient logger parameter to the createChatCompletion method options. -
#364
08907eb
Thanks @kamath! - exposed llmClient in stagehand constructor
-
#383
a77efcc
Thanks @sameelarif! - Unified LLM input/output types for reduced dependence on OpenAI types -
#353
5c6f14b
Thanks @kamath! - Throw custom error if context is referenced without initialization, remove act/extract handler from index -
#360
89841fc
Thanks @kamath! - Remove stagehand nav entirely -
#379
b1c6579
Thanks @seanmcguire12! - don't require LLM Client to use non-ai stagehand functions -
#382
a41271b
Thanks @sameelarif! - Added example implementation of the Vercel AI SDK as an LLMClient -
#344
c1cf345
Thanks @kamath! - Remove duplicate logging and expose Page/BrowserContext types
-
#324
cd23fa3
Thanks @kamath! - Move stagehand.act() -> stagehand.page.act() and deprecate stagehand.act() -
#319
bacbe60
Thanks @kamath! - We now wrap playwright page/context within StagehandPage and StagehandContext objects. This helps us augment the Stagehand experience by being able to augment the underlying Playwright -
#324
cd23fa3
Thanks @kamath! - moves extract and act -> page and deprecates stagehand.extract and stagehand.observe
-
#316
902e633
Thanks @kamath! - rename browserbaseResumeSessionID -> browserbaseSessionID -
#296
f11da27
Thanks @kamath!
- Deprecate fields in init in favor of constructor options
- Deprecate initFromPage in favor of browserbaseResumeSessionID in constructor
- Rename browserBaseSessionCreateParams -> browserbaseSessionCreateParams
-
#304
0b72f75
Thanks @seanmcguire12! - add textExtract: an optional, text based approach to the existing extract method. textExtract often performs better on long form extraction tasks. By default extract uses the existing approach domExtract. -
#298
55f0cd2
Thanks @kamath! - Add sessionId to public params
-
#283
b902192
Thanks @sameelarif! - allowed customization of eval config via .env -
#299
fbe2300
Thanks @sameelarif! - log playwright actions for better debugging
-
#286
9605836
Thanks @kamath! - minor improvement in action + new eval case -
#279
d6d7057
Thanks @kamath! - Add support for o1-mini and o1-preview in OpenAIClient -
#282
5291797
Thanks @kamath! - Added eslint for stricter type checking. Streamlined most of the internal types throughout the cache, llm, and handlers. This should make it easier to add new LLMs down the line, maintain and update the existing code, and make it easier to add new features in the future. Types can be checked by running npx eslint . from the project directory.
-
#270
6b10b3b
Thanks @sameelarif! - add close link to readme -
#288
5afa0b9
Thanks @kamath! - add multi-region support for browserbase -
#284
474217c
Thanks @kamath! - Build wasn't working, this addresses tsc failure. -
#236
85483fe
Thanks @seanmcguire12! - reduce chunk size
- #266
0e8f34f
Thanks @kamath! - Install wasn't working from NPM due to misconfigured build step. This attempts to fix that.
- #253
598cae2
Thanks @sameelarif! - clean up contexts after use
-
#225
a2366fe
Thanks @sameelarif! - Ensuring cross-platform compatibility with tmp directories -
#249
7d06d43
Thanks @seanmcguire12! - fix broken evals -
#227
647eefd
Thanks @kamath! - Fix debugDom still showing chunks when set to false -
#250
5886620
Thanks @seanmcguire12! - add ci specific evals -
#222
8dff026
Thanks @sameelarif! - Streamline type definitions and fix existing typescript errors -
#232
b9f9949
Thanks @kamath! - Minor changes to package.json and tsconfig, mainly around the build process. Also add more type defs and remove unused dependencies.
- #195
87a6305
Thanks @kamath!
- Adds structured and more standardized JSON logging
- Doesn't init cache if enableCaching is false, preventing tmp/.cache from being created
- Updates bundling for browser-side code to support NextJS and serverless
-
#179
0031871
Thanks @navidkpr! - Fixes: The last big change we pushed out introduced a small regression. As a result, the gray outline showing the elements Stagehand is looking at is missing. This commit fixes that. We now process selectorMap properly (using the updated type Record<number, string[]>).
Improved the action prompt:
- Improved the structure
- Made it more straightforward
- Improved working for completed arg and prioritized precision over recall