Automation

Actions

OpenSteer exposes four high-level DOM actions: click, hover, input, and scroll.

Actions

OpenSteer keeps the DOM action surface small on purpose. The supported high-level actions are click, hover, input, and scroll.

click()

SDK:

await opensteer.click({ element: 3 });
await opensteer.click({ selector: "#submit" });
await opensteer.click({ persist: "primary call to action" });
await opensteer.click({
  element: 3,
  persist: "primary call to action",
  button: "left",
  clickCount: 2,
  modifiers: ["Shift"],
});

CLI:

opensteer click 3 --workspace demo
opensteer click 3 --workspace demo --persist "primary call to action"
opensteer click 3 --workspace demo --button right

hover()

SDK:

await opensteer.hover({ element: 7 });
await opensteer.hover({ selector: "[data-testid='menu']" });
await opensteer.hover({ persist: "account menu" });

CLI:

opensteer hover 7 --workspace demo
opensteer hover 7 --workspace demo --persist "account menu"

input()

SDK:

await opensteer.input({ element: 5, text: "laptop" });
await opensteer.input({ selector: "#search", text: "laptop" });
await opensteer.input({
  persist: "search input",
  text: "laptop",
  pressEnter: true,
});

CLI:

opensteer input 5 laptop --workspace demo
opensteer input 5 laptop --workspace demo --press-enter
opensteer input 5 laptop --workspace demo --persist "search input"

scroll()

SDK:

await opensteer.scroll({ direction: "down", amount: 400 });
await opensteer.scroll({ element: 10, direction: "down", amount: 250 });
await opensteer.scroll({ persist: "results panel", direction: "down", amount: 600 });

CLI:

opensteer scroll down 400 --workspace demo
opensteer scroll down 250 --workspace demo --element 10
opensteer scroll down 250 --workspace demo --element 10 --persist "results panel"

Without --element, CLI scroll targets the page root.

captureNetwork

All four SDK DOM actions can capture network traffic while they run:

await opensteer.click({
  persist: "search button",
  captureNetwork: "search",
});

CLI equivalent:

opensteer click 7 --workspace demo --capture-network search

Return value

DOM actions resolve to OpensteerActionResult:

const result = await opensteer.click({ element: 3 });

console.log(result.target.tagName);
console.log(result.target.pathHint);
console.log(result.point);

If the action used or saved a persist key, the resolved target includes that persist value.

What is not on this surface

There is no separate high-level SDK method for:

  • select
  • uploadFile
  • pressKey
  • type

For coordinate-driven interaction, use Computer Use.