apis lie. screens never do.
automate any windows application through native accessibility apis. no computer vision. no fragile selectors. just the ui tree.
```typescript
import { createStep } from "@mediar-ai/workflow";

export const fillForm = createStep({
  id: "fill_form",
  execute: async ({ desktop, logger }) => {
    // type into the email field first
    const input = await desktop
      .locator("role:Edit")
      .first(2000); // wait up to 2s for the element to appear
    await input.typeText("user@example.com");

    // then find and click the submit button
    const btn = await desktop
      .locator("role:Button")
      .first(2000);
    await btn.click();
  },
});
```

computer vision is a hack. accessibility apis are the truth.
direct access to windows ui automation. no screenshots, no vision models.
same code, same result. every time. accessibility trees don't lie.
familiar locator patterns. if you know playwright, you know terminator.
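the fluent pattern is easy to sketch. this is a toy stand-in, not the real terminator api: the `Locator` class, `UINode` type, and the sample tree below are illustrative assumptions, with only the `locator("role:...").first(timeout)` shape borrowed from the snippet above.

```typescript
// toy sketch of a playwright-style fluent locator over an accessibility tree.
// NOT the real terminator implementation; names here are hypothetical.
type UINode = { role: string; name: string; children: UINode[] };

class Locator {
  constructor(private root: UINode, private selector: string) {}

  // "role:Button" selects by accessibility role, mirroring the snippet above.
  // the timeout argument is accepted but unused in this synchronous toy.
  first(_timeoutMs: number): UINode | undefined {
    const [kind, value] = this.selector.split(":");
    const walk = (n: UINode): UINode | undefined => {
      if (kind === "role" && n.role === value) return n;
      for (const c of n.children) {
        const hit = walk(c);
        if (hit) return hit;
      }
      return undefined;
    };
    return walk(this.root);
  }
}

// a hypothetical ui tree: a window holding an edit field and a button
const tree: UINode = {
  role: "Window",
  name: "Form",
  children: [
    { role: "Edit", name: "email", children: [] },
    { role: "Button", name: "submit", children: [] },
  ],
};

const btn = new Locator(tree, "role:Button").first(2000);
console.log(btn?.name); // "submit"
```

the point of the pattern: selectors address roles in the accessibility tree, not pixels, so the same query resolves identically on every run.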
mit licensed. inspect every line. no vendor lock-in.