qhints-rs

qhints-rs scans UI elements on your screen, assigns each one a keyboard label (a “hint”), and lets you interact with them by typing — no mouse required.

Type a label → click. Hold Ctrl → hover. Press / → select text. Press Shift → drag.

  • X11 only (Wayland not yet supported)
  • Rust rewrite of qhints
  • ~3200 lines of Rust
  • Daily-driven on i3wm with picom

Demo: https://youtu.be/BWC7h5dmkI4

Repo: https://github.com/smllb/qhints-rs

Installation

System requirements

  • X11 session (not Wayland)
  • AT-SPI D-Bus service — usually ships with your desktop. Verify:
    systemctl --user status at-spi-dbus-bus.service
    
  • Rust toolchain (install via rustup):
    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
    
  • System packages:
    sudo apt install xdotool libgtk-3-dev librsvg2-dev
    
  • For the OCR backend: sudo apt install clang libclang-dev
  • A compositor recommended (tested with picom on i3)

Build

git clone https://github.com/smllb/qhints-rs
cd qhints-rs
cargo build --release

Binary at target/release/qhints-rs.

For OCR support (optional):

cargo build --release --features ocr

Note: The OCR backend downloads ~35MB of models (text-detection.rten, text-recognition.rten) from AWS S3 on first run to ~/.cache/qhints/ocrs/.

Or install directly from git:

cargo install --git https://github.com/smllb/qhints-rs

Keybinding

Bind to a shortcut in your WM config. i3 example:

bindsym ctrl+shift+p exec --no-startup-id /home/you/qhints-rs/target/release/qhints-rs

You can also use the wrapper script at scripts/run-qhints.sh (logs to syslog).

Quick Start

1. Build

git clone https://github.com/smllb/qhints-rs
cd qhints-rs
cargo build --release

2. Run

./target/release/qhints-rs

A transparent overlay appears over the active window. Every detected UI element has a small yellow pill-shaped label on it, like QA, QS, QD.

3. Click something

Type the letters you see on the label. For example, if a terminal window shows QS on the close button, press Q then S. The cursor moves to that button and clicks.

  • The first letter of each label is red, the rest is dark gray.
  • As you type, matched letters turn green.
  • If no hint matches what you typed, the input resets automatically.
  • Press Escape to dismiss the overlay at any time.

4. Hover instead of click

Hold Ctrl while typing the last character — the cursor moves there but doesn’t click. Useful for tooltips and previews.

5. Try the other modes

If the overlay is up, press these keys to switch modes:

| Key | Mode | What it does |
| --- | --- | --- |
| (just type) | Normal | Click at element center |
| Ctrl + last key | Hover | Move mouse, no click |
| Alt | Double-click | Type one hint → double-click |
| / | Text select | Pick two elements → select between them |
| Shift | Drag | Pick source → pick destination → drag |

What to try it on

  • A file manager — close a window, hover a tooltip
  • A text editor — select a word using / mode
  • A browser — click buttons, drag a tab

Verbose output

./target/release/qhints-rs -v      # debug logs
./target/release/qhints-rs -vv     # trace logs (very noisy)

Modes

Normal mode

Type a hint → click at element center.

| Modifier | Effect |
| --- | --- |
| (none) | Click left button |
| Hold Ctrl on last key | Hover (mousemove, no click) |

Double-click mode

Press Alt (configurable double_click_key) to toggle. The next hint you type double-clicks that element. Mode auto-resets after one use.

Visual feedback: thicker label border when active.

Text selection mode

Press / (configurable text_select_key) to toggle. Select text between two elements.

Normal text selection

  1. Press / to enter text selection mode (blue borders appear on all elements)
  2. Type a hint to place the start marker (red vertical line)
  3. Type a second hint → selection is executed immediately:
    • Text words select from word-edge to word-edge
    • UI elements select from center to center

Advanced text selection

  1. Press / to enter text selection mode
  2. Type first hint → start marker placed
  3. Press / or Ctrl again to enter advanced mode (hints hide, only markers shown)
  4. Type second hint → end marker placed (orange vertical line)
  5. Fine-tune with controls:
| Key | Effect |
| --- | --- |
| Arrow keys | Nudge active marker position |
| Shift + arrow | Larger nudge |
| Tab | Switch between start/end marker |
| Enter | Confirm and execute selection |

Drag mode

Press Shift (configurable drag_key) to toggle. Drag an element to a new position.

Basic drag

  1. Press Shift to enter drag mode (green borders appear)
  2. Type first hint → source element selected
  3. Type second hint → destination picked, drag executes immediately
    • Mouse moves from source center to destination center
    • Smooth interpolation at 8px steps

Fullscreen drag

After placing the source (step 2), press Shift again to re-scan the entire monitor. This lets you pick a destination anywhere on screen, not just in the current window.

Advanced drag

  1. Press Shift, place source marker
  2. Press Ctrl to enter advanced drag mode
  3. Type second hint → destination marker placed
  4. Arrow keys to nudge, Tab to switch, Enter to confirm

Hunt mode

With hunt: true in config, the overlay re-appears after every action so you can perform multiple operations in sequence.

  • Hold Ctrl during hunt to signal “this is the last one” — next action exits
  • Auto-dismisses after 10 seconds of inactivity

Mode combos

| Sequence | Result |
| --- | --- |
| Alt → hint | Double-click |
| / → hint → hint | Select text between two elements |
| / → hint → / → hint → (arrows) → Enter | Advanced text select |
| Shift → hint → hint | Drag source to destination |
| Shift → hint → Shift → (pick anywhere) → hint | Fullscreen drag |

Configuration

Config file at ~/.config/qhints/config.json (or $XDG_CONFIG_HOME/qhints/config.json).

All fields are optional. If a field is missing, the Rust default is used.
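
The path lookup described above can be sketched as follows. This is a minimal illustration, not the project's code: `config_path_from` is a hypothetical helper introduced here so the logic can be exercised without touching the environment, and falling back to `.` when `HOME` is unset is an assumption.

```rust
use std::path::PathBuf;

/// Resolve the config file path from the two environment values.
/// Hypothetical helper: the documented `config_path()` takes no arguments.
fn config_path_from(xdg_config_home: Option<&str>, home: Option<&str>) -> PathBuf {
    // An empty XDG_CONFIG_HOME counts as unset, as the XDG spec requires.
    let base = match xdg_config_home {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        // Assumption: fall back to the current directory if HOME is missing.
        _ => PathBuf::from(home.unwrap_or(".")).join(".config"),
    };
    base.join("qhints").join("config.json")
}

/// Reads the real environment and delegates to the pure helper above.
fn config_path() -> PathBuf {
    let xdg = std::env::var("XDG_CONFIG_HOME").ok();
    let home = std::env::var("HOME").ok();
    config_path_from(xdg.as_deref(), home.as_deref())
}
```

Keeping the environment reads in a thin wrapper makes the resolution rule easy to check in isolation.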

Quick example

{
  "backends": ["imageproc"],
  "first_key_zones": [
    ["q","w","e","r","t","y","u","i","o","p"],
    ["a","s","d","f","g","h","n","m","l"]
  ],
  "hints": {
    "hint_opacity": 0.85,
    "hint_font_size": 12
  },
  "dev": {
    "spotlight": true
  }
}

General settings

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| exit_key | integer | 65307 (Escape) | Key (X11 keysym) that dismisses the overlay |
| hover_modifier | integer | 4 (Ctrl) | Modifier held on last key for hover |
| double_click_key | integer | 65513 (Alt_L) | Toggle double-click mode |
| text_select_key | integer | 47 (/) | Toggle text selection mode |
| drag_key | integer | 65505 (Shift_L) | Toggle drag mode |
| advanced_modifier | integer | 0 | Alternative key for advanced mode (0 = disabled, uses per-mode defaults) |
| overlay_x_offset | integer | 0 | Horizontal offset for overlay position |
| overlay_y_offset | integer | 0 | Vertical offset for overlay position |
| backends | array of strings | ["imageproc"] | Scanning backends in priority order |
| hunt | boolean | false | Re-scan after every action |
| center_zone_padding | float or object | 0.2 | Fraction of screen excluded from center zone (uniform or {top,right,bottom,left}) |

Hint labels (first_key_zones)

Controls which keys occupy which screen zones. A ragged-row grid where each cell holds the first-character keys for that zone. Example:

[
  ["q","w","e","r","t","y","u","i","o","p"],
  ["a","s","d","f","g","h","j","k","l"],
  ["z","x","c","v","b","n","m"]
]

Rows are equal-height bands. Columns within a row are equal-width. Short rows’ last cell spans to fill the remaining width.

Characters (complementary_keys_alphabet)

{ "complementary_keys_alphabet": "qwertyuiopasdfghjklzxcvbnm" }

Characters used for the 2nd (and 3rd) characters in multi-char hint labels. All first-key zone characters must appear in this alphabet.

Hint appearance

All color fields support separate _r _g _b _a values (0.0–1.0) or hex strings:

{ "hints": { "hint_font": "#2a2a2a" } }

Hex formats: #RGB, #RGBA, #RRGGBB, #RRGGBBAA.
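
To make the four hex formats concrete, here is a minimal sketch of a parser with the same signature as the `hex_to_rgba` helper listed under src/config.rs. The body is an assumption written for illustration; the project's implementation may differ (for example in how it treats malformed input).

```rust
/// Parse "#RGB", "#RGBA", "#RRGGBB" or "#RRGGBBAA" into (r, g, b, a),
/// each channel in 0.0..=1.0. Returns None for anything else.
fn hex_to_rgba(hex: &str) -> Option<(f64, f64, f64, f64)> {
    let digits = hex.strip_prefix('#')?;
    if !digits.chars().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    // Expand short forms by doubling each digit: "fa0" -> "ffaa00".
    let expanded: String = match digits.len() {
        3 | 4 => digits.chars().flat_map(|c| [c, c]).collect(),
        6 | 8 => digits.to_string(),
        _ => return None,
    };
    // Read the i-th two-digit channel and scale it to 0.0..=1.0.
    let channel = |i: usize| -> Option<f64> {
        let byte = u8::from_str_radix(&expanded[i * 2..i * 2 + 2], 16).ok()?;
        Some(f64::from(byte) / 255.0)
    };
    // Alpha defaults to fully opaque when the string carries no alpha digits.
    let alpha = if expanded.len() == 8 { channel(3)? } else { 1.0 };
    Some((channel(0)?, channel(1)?, channel(2)?, alpha))
}
```

With this reading, "#2a2a2a" gives 42/255 ≈ 0.16 per channel, which lines up with the documented hint_font_r/g/b default.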

Label box

| Field | Default | Description |
| --- | --- | --- |
| hint_height | 20.0 | Label height in pixels |
| hint_width_padding | 10.0 | Horizontal padding inside label |
| hint_corner_radius | 6.0 | Rounded corner radius |
| hint_border_width | 1.0 | Border line width |
| hint_opacity | 1.0 | Global alpha multiplier for all visuals |
| hint_upercase | true | Display labels in uppercase |

Background

| Field | Default | Description |
| --- | --- | --- |
| hint_background_r | 1.0 | Background red |
| hint_background_g | 0.95 | Background green |
| hint_background_b | 0.55 | Background blue |
| hint_background_a | 0.95 | Background alpha |

Border

| Field | Default | Description |
| --- | --- | --- |
| hint_border_r | 0.78 | Border red |
| hint_border_g | 0.72 | Border green |
| hint_border_b | 0.36 | Border blue |
| hint_border_a | 1.0 | Border alpha |

Text

| Field | Default | Description |
| --- | --- | --- |
| hint_font_size | 14.0 | Font size |
| hint_font_face | "monospace" | Font family |
| hint_font_r | 0.16 | Character color red |
| hint_font_g | 0.16 | Character color green |
| hint_font_b | 0.16 | Character color blue |
| hint_font_a | 1.0 | Character alpha |

First character

| Field | Default | Description |
| --- | --- | --- |
| hint_first_font_r | 0.85 | First-char color red |
| hint_first_font_g | 0.1 | First-char color green |
| hint_first_font_b | 0.1 | First-char color blue |
| hint_first_font_a | 1.0 | First-char alpha |
| hint_first_font_size_boost | 0.0 | Extra font size for first char |

Pressed (typed) state

| Field | Default | Description |
| --- | --- | --- |
| hint_pressed_font_r | 0.45 | Typed-char color red |
| hint_pressed_font_g | 0.75 | Typed-char color green |
| hint_pressed_font_b | 0.25 | Typed-char color blue |
| hint_pressed_font_a | 1.0 | Typed-char alpha |

Shadow

| Field | Default | Description |
| --- | --- | --- |
| hint_shadow | true | Enable drop shadow |
| hint_shadow_r | 0.0 | Shadow color red |
| hint_shadow_g | 0.0 | Shadow color green |
| hint_shadow_b | 0.0 | Shadow color blue |
| hint_shadow_a | 0.3 | Shadow opacity |
| hint_shadow_offset_x | 1.0 | Shadow x offset |
| hint_shadow_offset_y | 1.0 | Shadow y offset |

Text selection mode settings

| Field | Default | Description |
| --- | --- | --- |
| text_select_border_r | 0.0 | Border red in text selection mode |
| text_select_border_g | 0.6 | Border green |
| text_select_border_b | 1.0 | Border blue |
| text_select_border_a | 1.0 | Border alpha |
| text_select_padding_left | 0.0 | Left selection offset (fraction of element width) |
| text_select_padding_right | 0.0 | Right selection offset |
| text_select_advanced_key | 0 | Per-mode key to toggle advanced mode |
| text_select_nudge_step_x | 0.03 | Arrow nudge step X (fraction of element) |
| text_select_nudge_step_y | 0.03 | Arrow nudge step Y |
| text_select_nudge_step_shift_x | 0.15 | Shift+arrow nudge step X |
| text_select_nudge_step_shift_y | 0.15 | Shift+arrow nudge step Y |
| text_select_pulse_period_ms | 1200 | Marker pulse animation period |
| marker_pulse_interval_ms | 83 | Pulse redraw interval (83 ms is about 12 redraws per second) |
| marker_bright_duration_ticks | 10 | Bright flash duration on marker placement |
| text_selection_show_boxes | true | Show blue bounding boxes around all children |

Drag mode settings

| Field | Default | Description |
| --- | --- | --- |
| drag_advanced_key | 0 | Per-mode key to toggle advanced drag |
| drag_delay_ms | 50 | Delay before mousedown/mouseup |
| drag_fullscreen_default | true | Auto-trigger fullscreen re-scan after source pick |
| drag_marker_shape | "circle" | Marker shape ("circle" or "square") |
| drag_marker_size | 4.0 | Marker radius/half-size |
| drag_show_boxes | true | Show green bounding boxes around all children |

Advanced mode

| Field | Default | Description |
| --- | --- | --- |
| advanced_border_extra_width | 0.25 | Extra border width in advanced mode |

Overlap culling

| Field | Default | Description |
| --- | --- | --- |
| hint_overlap_threshold | 60.0 | 0 = show all, 100 = very aggressive culling |

Dev options

{
  "dev": {
    "show_grid": false,
    "hunt": false,
    "hunt_timeout_ms": 300,
    "spotlight": false,
    "spotlight_opacity": 0.65,
    "spotlight_radius": 2.5,
    "advanced_spotlight_opacity": 0.4,
    "drag_spotlight_opacity": 0.4,
    "show_text_boxes": false,
    "show_bfs_boxes": false,
    "save_debug_images": false
  }
}

| Field | Default | Description |
| --- | --- | --- |
| show_grid | false | Draw zone grid boundaries |
| hunt | false | Re-scan after every action |
| hunt_timeout_ms | 300 | Delay before re-scan |
| spotlight | false | Dark overlay with holes around matching hints |
| spotlight_opacity | 0.65 | Darkness of spotlight overlay |
| spotlight_radius | 2.5 | Multiplier for spotlight hole radius |
| advanced_spotlight_opacity | 0.4 | Spotlight darkness in advanced selection mode |
| drag_spotlight_opacity | 0.4 | Spotlight darkness in drag mode |
| show_text_boxes | false | Blue bounding boxes around text words |
| show_bfs_boxes | false | Red bounding boxes around BFS components |
| save_debug_images | false | Save intermediate debug PNGs to /tmp/qhints_debug/ |

Application rules

Per-application overrides keyed by the application’s WM_CLASS name.

{
  "application_rules": {
    "firefox": {
      "canny_min_val": 20,
      "canny_max_val": 50,
      "kernel_size": 3
    },
    "default": {
      "canny_min_val": 15,
      "canny_max_val": 40,
      "kernel_size": 3
    }
  }
}

| Field | Default | Description |
| --- | --- | --- |
| scale_factor | 1.0 | HiDPI coordinate scale |
| detection_scale | 1.0 | Screenshot upscale before CV (1–4) |
| states | [24, 25, 30] | AT-SPI state filter (Sensitive, Showing, Visible) |
| states_match_type | 1 (ALL) | Match type: 1=all, 3=none |
| roles | excluded roles | AT-SPI role filter |
| roles_match_type | 3 (NONE) | Match type: 1=all, 3=none |
| canny_min_val | 15 | Canny low threshold |
| canny_max_val | 40 | Canny high threshold |
| kernel_size | 3 | Dilation kernel |
| center_zone_padding | 0.2 | Per-rule center zone padding |

Architecture

Data flow

main.rs ──→ backends (scan window → Vec<Child>)
         ──→ hints.rs (assign labels → HashMap<label, index>)
         ──→ overlay GTK window (capture keyboard, draw hints, return action)
         ──→ main.rs (execute xdotool command)

1. Scan

The focused window’s UI elements are detected through multiple backends:

  1. AT-SPI runs first (async D-Bus tree walk, 250ms hard deadline)
  2. Fallback backends run in configured order (imageproc, OCR)
  3. All results merge into a single Vec<Child>

2. Filter

  • Children smaller than 0.5% of screen dimensions are removed
  • Pairwise overlap culling removes duplicates, preferring Text over Element
  • Survivors are re-labeled with fresh short hints

3. Label

hints::get_hints() assigns keyboard labels using a spatial zone grid:

  1. Screen divided into zones based on first_key_zones config
  2. Children bucketed into their zone by position
  3. Overflow redistributes to neighbors, then globally
  4. Each zone gets single-char or multi-char labels

4. Show

A transparent GTK3 popup window is positioned over the target window. Cairo renders hint labels, markers, and overlays. Keyboard is grabbed.

5. Input

Key events feed through a state machine that tracks:

  • Typed prefix buffer
  • Current mode (normal, text selection, drag, double-click)
  • Advanced mode sub-state
  • Marker positions and offsets
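
As an illustration of the typed-prefix part of that state machine, the sketch below feeds one character at a time into a buffer and reports whether a label matched, is still possible, or forced a reset. `KeyOutcome` and `feed_key` are hypothetical names introduced for this example; the real overlay state carries far more (modes, markers, hunt flags).

```rust
use std::collections::HashMap;

/// Result of feeding one typed character to the matcher.
#[derive(Debug, PartialEq)]
enum KeyOutcome {
    /// The typed buffer equals a label; carries the child index.
    Matched(usize),
    /// At least one label still starts with the typed buffer.
    Pending,
    /// Nothing matches; the buffer has been cleared.
    Reset,
}

/// Append `key` to `typed` and compare against the hint labels.
/// Assumption: labels are stored lowercase and matching ignores case.
fn feed_key(typed: &mut String, hints: &HashMap<String, usize>, key: char) -> KeyOutcome {
    typed.push(key.to_ascii_lowercase());
    if let Some(&index) = hints.get(typed.as_str()) {
        typed.clear();
        return KeyOutcome::Matched(index);
    }
    if hints.keys().any(|label| label.starts_with(typed.as_str())) {
        KeyOutcome::Pending
    } else {
        // Mirrors the documented behaviour: no matching hint resets the input.
        typed.clear();
        KeyOutcome::Reset
    }
}
```

With labels "qa" and "qs", typing Q leaves the matcher pending and S then resolves to the second child.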

6. Act

A MouseAction is returned to main.rs and dispatched via xdotool:

  • click: xdotool mousemove X Y click 1
  • hover: xdotool mousemove X Y
  • drag: interpolated mousemove from source to destination
  • select: mousedown at start, mousemove to end, mouseup
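
The click and hover commands above map directly onto an xdotool argument list. The following is a sketch only: `click_args`, `hover_args`, and `run_xdotool` are hypothetical helper names, and the documentation states that the real dispatch is inlined in main.rs.

```rust
use std::process::Command;

/// Arguments for `xdotool mousemove X Y click BUTTON`.
fn click_args(x: i32, y: i32, button: u32) -> Vec<String> {
    vec![
        "mousemove".to_string(),
        x.to_string(),
        y.to_string(),
        "click".to_string(),
        button.to_string(),
    ]
}

/// Arguments for `xdotool mousemove X Y` (hover: move only, no click).
fn hover_args(x: i32, y: i32) -> Vec<String> {
    vec!["mousemove".to_string(), x.to_string(), y.to_string()]
}

/// Run xdotool with the given arguments and turn a non-zero exit into an error.
fn run_xdotool(args: &[String]) -> Result<(), Box<dyn std::error::Error>> {
    let status = Command::new("xdotool").args(args).status()?;
    if status.success() {
        Ok(())
    } else {
        Err(format!("xdotool exited with {status}").into())
    }
}
```

Building the argument list separately from spawning the process keeps the command shape easy to inspect in logs or tests.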

Key data structures

struct Child {
    relative_position: (f64, f64),   // relative to window top-left
    absolute_position: (f64, f64),   // screen coordinates
    width: f64,
    height: f64,
    kind: ChildKind,                 // Element | Text
}

struct MouseAction {
    action: String,                  // "click" | "hover" | "drag" | "select"
    x: i32,                          // primary coordinates
    y: i32,
    end_x: i32,                      // secondary coordinates (for drag/select)
    end_y: i32,
    button: u32,                     // mouse button
    repeat: u32,                     // click count
    hunt_continue: bool,             // continue hunt loop?
    drag_fullscreen: bool,           // trigger fullscreen re-scan?
}

ChildKind system

  • ChildKind::Text — word-level content from OCR or imageproc text detection
    • Gets blue border in text selection mode
    • Selection snaps to word edges
  • ChildKind::Element — UI components (buttons, icons, BFS components)
    • Normal border always
    • Selection uses element center (left-to-right)

Hint generation algorithm

first_key_zones (ragged grid: e.g. 10/9/7 columns)
        │
    map child (rx, ry) → (row, col) zone
        │
    per-zone capacity = len(zone_keys) × alphabet_len
        │
    overflow? → redistribute to neighbors (spatial)
      still overflow? → global redistribution
        │
    within capacity? → single-char keys
    within 2-char?   → first_key + alphabet_char
    else             → first_key + r1 + r2 (3-char)

Center-zone children get priority for shorter (single-char) labels. Periphery zones prefer 2-char to reserve short labels for center elements.
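
The length decision in the diagram can be sketched as below. This is one possible reading, not the project's algorithm: `labels_for_zone` is a hypothetical function, it assumes every label in a zone has the same length (so no label is a prefix of another), and it ignores overflow redistribution and the center-zone priority.

```rust
/// Produce up to `count` labels for one zone from its first keys and the
/// complementary alphabet. Returns fewer labels if even 3-char capacity
/// is exceeded.
fn labels_for_zone(zone_keys: &[char], alphabet: &[char], count: usize) -> Vec<String> {
    let mut labels = Vec::with_capacity(count);
    if count <= zone_keys.len() {
        // Single-char: one first key per child.
        labels.extend(zone_keys.iter().take(count).map(|k| k.to_string()));
    } else if count <= zone_keys.len() * alphabet.len() {
        // Two-char: first_key + alphabet_char.
        'two: for &k in zone_keys {
            for &a in alphabet {
                if labels.len() == count {
                    break 'two;
                }
                labels.push(format!("{k}{a}"));
            }
        }
    } else {
        // Three-char: first_key + r1 + r2.
        'three: for &k in zone_keys {
            for &r1 in alphabet {
                for &r2 in alphabet {
                    if labels.len() == count {
                        break 'three;
                    }
                    labels.push(format!("{k}{r1}{r2}"));
                }
            }
        }
    }
    labels
}
```

For a zone with first keys q and w and alphabet a, s, d, three children get qa, qs, qd, matching the label style shown in the Quick Start.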

Overlap culling (two-pass)

Pass 1 (main.rs): Pairwise overlap on raw children before labeling

  • When overlap exceeds threshold: Text wins over Element, otherwise the larger child survives

Pass 2 (drawing.rs): On rendered hint label rectangles

  • Deterministic: keep the first (top-left) visible hint
  • Uses hint_overlap_threshold config (default 60%)

Threading

| Operation | Thread | Timeout |
| --- | --- | --- |
| X11 init | Spawned thread | 2s |
| AT-SPI tree walk | Spawned thread with tokio | 250ms |
| imageproc scan | Spawned thread | 5s |
| OCR scan | Spawned thread | 15s |
| GTK main loop | Main thread | (none) |
| Pulse animation | GLib timer | marker_pulse_interval_ms |

Note: Spawned threads are NOT cancelled on timeout (Rust's standard library has no way to kill a running thread). Internal backends have their own timeouts, and the lock file keeps the number of orphaned threads to two or three at most.

Lock file

A lock file at /tmp/qhints.lock prevents multiple concurrent instances. Uses flock(LOCK_EX | LOCK_NB).

Backends

Backends are configured via the backends config field. All configured backends run in order and their results are merged.

Default: ["imageproc"]

{ "backends": ["atspi", "imageproc", "ocrs"] }

Merge pipeline

AT-SPI ──────┐
imageproc ───┤── merge ──→ filter tiny ──→ overlap cull ──→ label
ocrs ────────┘

When multiple backends produce overlapping children:

  1. Text references from fallback backends reclassify BFS components (Element → Text) when overlap exceeds 95%
  2. Original backend Text references are discarded — only BFS components survive
  3. Pairwise overlap culling prefers Text over Element

AT-SPI (async D-Bus)

Connects to the system’s accessibility bus via atspi + zbus. Walks the accessibility tree of the focused window.

  • Queries roles, states, and geometry via batched async calls
  • Max recursion depth: 20
  • Max children per level: 500
  • Filters by application state (Sensitive + Showing + Visible)
  • Excludes roles: PANEL, FRAME, MENU_BAR, TOOL_BAR, LIST, SCROLL_PANE, TABLE, etc.
  • Timeout: 150ms tokio timeout + 250ms hard deadline

Text role detection

The following AT-SPI roles produce ChildKind::Text: Label(36), Text(74), DocumentText(87), Static(116), Paragraph(73), Heading(83)

Everything else becomes ChildKind::Element.

Requirements

systemctl --user status at-spi-dbus-bus.service

Some applications (VS Code, browsers) may not expose accessibility info unless launched with appropriate flags.

Imageproc (computer vision)

Screenshot-based detection. Runs when "imageproc" is in the backends list.

Pipeline

  1. Screenshot — X11 GetImage on the window region
  2. Grayscale conversion — dual pass: max-of-RGB (for edges) and weighted luminance (for text detection)
  3. Canny edge detection — configurable min/max thresholds and detection scale
  4. Text word detection — horizontal projection → text line bands → vertical projection per line → word segments
  5. Dilation — morphological dilation to connect nearby edges
  6. BFS — 4-direction flood fill → connected components
  7. Filter — remove components >50% of window
  8. Overlap analysis — BFS components overlapping text words by >95% are reclassified as Text

Output

  • BFS connected components → ChildKind::Element
  • Text line/word segments → ChildKind::Text
  • Debug images saved to /tmp/qhints_debug/ when dev.save_debug_images is enabled

Per-app tuning

{
  "application_rules": {
    "firefox": {
      "canny_min_val": 20,
      "canny_max_val": 50,
      "kernel_size": 3,
      "detection_scale": 1.0
    }
  }
}

Higher canny_min_val = fewer edges (sparser detection). Higher detection_scale = upscale before detection (more detail, slower).

Timeout: 5 seconds.

OCR (optional)

Feature-gated. Build with --features ocr to enable.

Uses ocrs + rten for text detection and recognition. Downloads pre-trained models (~35MB) from AWS S3 on first run to ~/.cache/qhints/ocrs/.

Note: Requires clang + libclang-dev system packages.

Pipeline

  1. Screenshot — same X11 GetImage as imageproc
  2. OCR: ocrs::OcrEngine detects word bounding boxes
  3. Word boxes → ChildKind::Text
  4. BFS gap-filling — Canny edge detection + dilation + BFS on the same screenshot to find non-text elements
  5. Filter — BFS components overlapping OCR words by >30% are removed
  6. Merge — BFS components appended first, then word boxes

Models

  • text-detection.rten — finds text regions
  • text-recognition.rten — recognizes characters (not used for positioning)

Timeout: 15 seconds.

Troubleshooting

Overlay doesn’t appear

Check you’re on X11:

echo $XDG_SESSION_TYPE  # should say "x11"

Verify AT-SPI is running:

systemctl --user status at-spi-dbus-bus.service

Run from a terminal with -v to see logs:

./qhints-rs -v

Keyboard input not reaching overlay

The keyboard grab can fail with some window managers. As a safety net, clicking on the overlay dismisses it. Use -vv for trace-level logs.

No elements detected

Some applications (VS Code, browsers) don’t expose accessibility info. Try adding imageproc to your backends:

{ "backends": ["imageproc"] }

If that works but AT-SPI doesn’t, your app may need --force-renderer-accessibility or similar flags.

“qhints already running”

The lock file at /tmp/qhints.lock prevents concurrent instances. If qhints crashed, remove it:

rm /tmp/qhints.lock

Overlay on wrong monitor

Use overlay_x_offset and overlay_y_offset to shift. With i3 + multi-monitor, the overlay may need adjustments:

{
  "overlay_x_offset": 1920,
  "overlay_y_offset": 0
}

Can’t find config file

Config path: ~/.config/qhints/config.json. Create it if missing — all fields are optional.

Hints too large / too small

Adjust hint_font_size and hint_height:

{
  "hints": {
    "hint_font_size": 12,
    "hint_height": 18
  }
}

Selection/drag precision issues

Decrease nudge step values for finer control:

{
  "hints": {
    "text_select_nudge_step_x": 0.01,
    "text_select_nudge_step_y": 0.01
  }
}

Build fails

Ensure system deps are installed:

sudo apt install xdotool libgtk-3-dev librsvg2-dev clang libclang-dev

The OCR feature needs clang and libclang-dev. If OCR is enabled in your build and you don't need it, build without default features:

cargo build --release --no-default-features

GPU/performance

The overlay uses software rendering via Cairo. On slower systems, increase marker_pulse_interval_ms to reduce redraw frequency:

{
  "hints": {
    "marker_pulse_interval_ms": 150
  }
}

Wayland

qhints-rs depends on:

  • x11rb — window geometry and focus tracking
  • xdotool — mouse input simulation
  • GTK3 overlay (may work under XWayland, untested)

Wayland support would require:

  • A Wayland client library for window info
  • ext-image-capture-src for screenshots
  • libei / ydotool for input

Contributions welcome.

src/child.rs

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ChildKind {
    Element,
    Text,
}

#[derive(Debug, Clone)]
pub struct Child {
    pub relative_position: (f64, f64),
    pub absolute_position: (f64, f64),
    pub width: f64,
    pub height: f64,
    pub kind: ChildKind,
}

src/config.rs

Constants

pub const ATSPI_STATE_SENSITIVE: i32 = 24;
pub const ATSPI_STATE_SHOWING: i32 = 25;
pub const ATSPI_STATE_VISIBLE: i32 = 30;
pub const ATSPI_MATCH_ALL: i32 = 1;
pub const ATSPI_MATCH_NONE: i32 = 3;
pub const EXCLUDED_ROLES: &[i32] = &[
    39, 85, 25, 23, 34, 63, 31, 38, 121, 49, 55, 99, 116, 83, 73, 123, 110, 20, 122
];

Private helpers

fn config_path() -> PathBuf
fn hex_to_rgba(hex: &str) -> Option<(f64, f64, f64, f64)>

Structs

#[derive(Debug, Clone, Deserialize)]
pub struct HintStyle {
    pub hint_height: f64,
    pub hint_width_padding: f64,
    pub hint_font_size: f64,
    pub hint_font_face: String,
    pub hint_font_r: f64,
    pub hint_font_g: f64,
    pub hint_font_b: f64,
    pub hint_font_a: f64,
    pub hint_first_font_r: f64,
    pub hint_first_font_g: f64,
    pub hint_first_font_b: f64,
    pub hint_first_font_a: f64,
    pub hint_first_font_size_boost: f64,
    pub hint_overlap_threshold: f64,
    pub hint_pressed_font_r: f64,
    pub hint_pressed_font_g: f64,
    pub hint_pressed_font_b: f64,
    pub hint_pressed_font_a: f64,
    pub hint_upercase: bool,
    pub hint_background_r: f64,
    pub hint_background_g: f64,
    pub hint_background_b: f64,
    pub hint_background_a: f64,
    pub hint_border_r: f64,
    pub hint_border_g: f64,
    pub hint_border_b: f64,
    pub hint_border_a: f64,
    pub hint_border_width: f64,
    pub hint_corner_radius: f64,
    pub text_select_border_r: f64,
    pub text_select_border_g: f64,
    pub text_select_border_b: f64,
    pub text_select_border_a: f64,
    pub text_select_padding_left: f64,
    pub text_select_padding_right: f64,
    pub text_select_advanced_key: u32,
    pub drag_advanced_key: u32,
    pub text_select_nudge_step_x: f64,
    pub text_select_nudge_step_y: f64,
    pub text_select_nudge_step_shift_x: f64,
    pub text_select_nudge_step_shift_y: f64,
    pub drag_fullscreen_default: bool,
    pub drag_delay_ms: u64,
    pub text_select_pulse_period_ms: u64,
    pub marker_pulse_interval_ms: u64,
    pub marker_bright_duration_ticks: u32,
    pub advanced_border_extra_width: f64,
    pub drag_marker_shape: String,
    pub drag_marker_size: f64,
    pub hint_shadow: bool,
    pub hint_shadow_r: f64,
    pub hint_shadow_g: f64,
    pub hint_shadow_b: f64,
    pub hint_shadow_a: f64,
    pub hint_shadow_offset_x: f64,
    pub hint_shadow_offset_y: f64,
    pub text_selection_show_boxes: bool,
    pub drag_show_boxes: bool,
    pub hint_opacity: f64,
}

impl Default for HintStyle

#[derive(Debug, Clone, Copy)]
pub struct ZonePadding {
    pub top: f64,
    pub right: f64,
    pub bottom: f64,
    pub left: f64,
}

impl ZonePadding {
    pub fn uniform(pad: f64) -> Self
}

#[derive(Debug, Clone)]
pub struct DevOptions {
    pub show_grid: bool,
    pub hunt: bool,
    pub hunt_timeout_ms: u32,
    pub spotlight: bool,
    pub spotlight_opacity: f64,
    pub spotlight_radius: f64,
    pub advanced_spotlight_opacity: f64,
    pub drag_spotlight_opacity: f64,
    pub show_text_boxes: bool,
    pub show_bfs_boxes: bool,
    pub save_debug_images: bool,
}

impl Default for DevOptions

#[derive(Debug, Clone)]
pub struct ApplicationRule {
    pub scale_factor: f64,
    pub detection_scale: f64,
    pub states: Vec<i32>,
    pub states_match_type: i32,
    pub roles: Vec<i32>,
    pub roles_match_type: i32,
    pub canny_min_val: i32,
    pub canny_max_val: i32,
    pub kernel_size: i32,
    pub center_zone_padding: ZonePadding,
}

impl Default for ApplicationRule

#[derive(Debug, Clone)]
pub struct Config {
    pub hints: HintStyle,
    pub complementary_keys_alphabet: String,
    pub exit_key: u32,
    pub hover_modifier: u32,
    pub double_click_key: u32,
    pub advanced_modifier: u32,
    pub drag_key: u32,
    pub text_select_key: u32,
    pub overlay_x_offset: i32,
    pub overlay_y_offset: i32,
    pub application_rules: HashMap<String, ApplicationRule>,
    pub backends: Vec<String>,
    pub first_key_zones: Vec<Vec<String>>,
    pub center_zone_padding: ZonePadding,
    pub dev: DevOptions,
}

impl Default for Config

Public functions

pub fn load_config() -> Config

Loads config from $XDG_CONFIG_HOME/qhints/config.json, merging over Config::default(). If the file doesn’t exist, returns defaults.

Private functions

fn merge_user_config(config: &mut Config, json: &serde_json::Value)

Recursive merge of user JSON into Config. Handles all top-level fields, hints.*, dev.*, backends, application_rules, first_key_zones, and center_zone_padding.

src/hints.rs

Public functions

pub fn get_hints(
    children: &[Child],
    complementary_keys_alphabet: &str,
    first_key_zones: &[Vec<String>],
    center_zone_padding: &config::ZonePadding,
    window_size: Option<(f64, f64)>,
) -> HashMap<String, usize>

Returns a map of hint label string → child index.

Private functions

fn get_zone(
    rx: f64,
    ry: f64,
    width: f64,
    height: f64,
    rows: usize,
    col_counts: &[usize],
) -> (usize, usize)

Normalizes (rx, ry) by (width, height) and maps to ragged grid (row, col).
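
A possible body for this signature is sketched below. The column rule is an assumption derived from the configuration section ("short rows' last cell spans to fill the remaining width"): every cell is as wide as a cell of the widest row, and anything past a short row's last column lands in that last cell.

```rust
/// Map a window-relative position to a (row, col) cell of a ragged grid.
fn get_zone(
    rx: f64,
    ry: f64,
    width: f64,
    height: f64,
    rows: usize,
    col_counts: &[usize],
) -> (usize, usize) {
    // Normalize to 0.0..=1.0 and clamp so edge pixels stay inside the grid.
    let nx = (rx / width).clamp(0.0, 1.0);
    let ny = (ry / height).clamp(0.0, 1.0);
    // Rows are equal-height bands.
    let row = ((ny * rows as f64) as usize).min(rows.saturating_sub(1));
    // Assumption: cell width is set by the widest row.
    let max_cols = col_counts.iter().copied().max().unwrap_or(1).max(1);
    let cols_in_row = col_counts.get(row).copied().unwrap_or(1).max(1);
    // Positions beyond a short row's last column fall into its last cell.
    let col = ((nx * max_cols as f64) as usize).min(cols_in_row - 1);
    (row, col)
}
```

With the 10/9/7 example grid on a 1000×300 window, a point at x = 950 in the middle row maps to column 8, the stretched last cell of that row.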

fn neighbors(
    r: usize,
    c: usize,
    rows: usize,
    col_counts: &[usize],
) -> Vec<(usize, usize)>

Returns all 8-directional adjacent zones that exist (respecting ragged rows), sorted by Euclidean distance from (r, c).
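
One way to implement that description is shown here as a sketch. It assumes that a neighbor "exists" when its row is in range and its column index is below that row's column count, and that ties in distance keep scan order.

```rust
/// 8-directional neighbors of zone (r, c) in a ragged grid, nearest first.
fn neighbors(r: usize, c: usize, rows: usize, col_counts: &[usize]) -> Vec<(usize, usize)> {
    let mut found: Vec<((usize, usize), i64)> = Vec::new();
    for dr in -1i64..=1 {
        for dc in -1i64..=1 {
            if dr == 0 && dc == 0 {
                continue; // skip the zone itself
            }
            let nr = r as i64 + dr;
            let nc = c as i64 + dc;
            if nr < 0 || nc < 0 || nr >= rows as i64 {
                continue;
            }
            // Ragged rows: a column may exist in one row but not the next.
            let cols = col_counts.get(nr as usize).copied().unwrap_or(0) as i64;
            if nc >= cols {
                continue;
            }
            // Squared distance sorts the same way as Euclidean distance.
            found.push(((nr as usize, nc as usize), dr * dr + dc * dc));
        }
    }
    // Stable sort: equal distances keep their scan order.
    found.sort_by_key(|&(_, d2)| d2);
    found.into_iter().map(|(zone, _)| zone).collect()
}
```

Orthogonal neighbors (distance 1) come before diagonal ones (distance √2), so overflow spills sideways or vertically first.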

fn generate_product(
    chars: &[char],
    repeat: usize,
    out: &mut Vec<String>,
)

Recursive cartesian product. Generates all possible Strings of length repeat from chars. Used as fallback when no window_size is available.
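
A minimal sketch of such a recursive product follows. The inner helper `extend` and the output ordering (first character varies slowest) are assumptions for illustration.

```rust
/// Push every string of length `repeat` over `chars` into `out`.
fn generate_product(chars: &[char], repeat: usize, out: &mut Vec<String>) {
    // Recursive helper: grow `prefix` one character at a time.
    fn extend(chars: &[char], remaining: usize, prefix: &mut String, out: &mut Vec<String>) {
        if remaining == 0 {
            out.push(prefix.clone());
            return;
        }
        for &c in chars {
            prefix.push(c);
            extend(chars, remaining - 1, prefix, out);
            prefix.pop(); // backtrack before trying the next character
        }
    }
    let mut prefix = String::with_capacity(repeat);
    extend(chars, repeat, &mut prefix, out);
}
```

The output grows as `chars.len()` to the power of `repeat`, so a 26-letter alphabet already yields 676 two-character labels.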

src/mouse.rs

Note: Unused. All mouse dispatch is inlined in main.rs.

pub fn click(
    x: i32,
    y: i32,
    button: u32,
    repeat: u32,
) -> Result<(), Box<dyn std::error::Error>>

Shells out to xdotool mousemove X Y click BUTTON, repeated repeat times.

src/main.rs

Module declarations

mod backend;
mod child;
mod config;
mod hints;
mod mouse;
mod overlay;
mod window_system;

CLI

#[derive(Parser)]
#[command(name = "qhints-rs", about = "Keyboard-driven UI navigation for Linux")]
struct Cli {
    #[arg(short, long, default_value = "hint")]
    mode: String,

    #[arg(short, long, action = clap::ArgAction::Count)]
    verbose: u8,
}

Public functions

fn main()
  1. Parse CLI args
  2. Set log level: 0=Info, 1=Debug, 2+=Trace
  3. Load config
  4. Dispatch: "hint" → hint_mode(), "scroll" → warn not implemented

Private functions

fn try_acquire_lock() -> Option<std::fs::File>

Creates /tmp/qhints.lock and tries flock(LOCK_EX | LOCK_NB). Returns Some(file) on success, None if another instance holds the lock.
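
The locking pattern can be sketched with the standard library plus a hand-written declaration of flock(2). Everything here is illustrative: `try_acquire_lock_at` is a hypothetical variant that takes the path as a parameter, the constants are the Linux values, and the project may well use a crate such as libc instead of declaring the function itself.

```rust
use std::fs::{File, OpenOptions};
use std::os::unix::io::AsRawFd;
use std::path::Path;

// flock(2) from the C library, declared by hand to keep the sketch std-only.
extern "C" {
    fn flock(fd: i32, operation: i32) -> i32;
}

// Linux values from <sys/file.h>.
const LOCK_EX: i32 = 2; // exclusive lock
const LOCK_NB: i32 = 4; // fail instead of blocking

/// Open (or create) the lock file and try to take an exclusive lock.
/// The lock is held for as long as the returned File stays open.
fn try_acquire_lock_at(path: &Path) -> Option<File> {
    let file = OpenOptions::new()
        .create(true)
        .write(true)
        .truncate(false)
        .open(path)
        .ok()?;
    // SAFETY: the descriptor is valid while `file` is alive.
    let rc = unsafe { flock(file.as_raw_fd(), LOCK_EX | LOCK_NB) };
    if rc == 0 {
        Some(file)
    } else {
        None
    }
}
```

Because the kernel releases a flock when the descriptor is closed, dropping the returned File (or exiting the process) frees the lock.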

fn with_thread_timeout<T: Send + 'static>(
    f: impl FnOnce() -> T + Send + 'static,
    timeout: std::time::Duration,
    label: &'static str,
) -> Option<T>

Spawns a thread for f(), waits up to timeout for a result via mpsc::channel. Returns Some(T) on success, None on timeout or thread panic.

The thread is not cancelled on timeout. Backends have internal timeouts, and the lock file keeps the number of orphaned threads to two or three at most.
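
A body matching that description can be sketched with `std::sync::mpsc` and `recv_timeout`. The logging calls are placeholders (plain `eprintln!`), since the logging setup is not shown here.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Run `f` on a new thread and wait at most `timeout` for its result.
fn with_thread_timeout<T: Send + 'static>(
    f: impl FnOnce() -> T + Send + 'static,
    timeout: Duration,
    label: &'static str,
) -> Option<T> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // If the receiver already gave up, the send fails and is ignored.
        let _ = tx.send(f());
    });
    match rx.recv_timeout(timeout) {
        Ok(value) => Some(value),
        Err(mpsc::RecvTimeoutError::Timeout) => {
            // The worker keeps running; it is only abandoned, not stopped.
            eprintln!("{label}: timed out after {timeout:?}");
            None
        }
        Err(mpsc::RecvTimeoutError::Disconnected) => {
            // The sender was dropped without sending: the closure panicked.
            eprintln!("{label}: worker thread panicked");
            None
        }
    }
}
```

A panic in the worker drops the sender without sending, which is why the disconnected case also maps to None.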

fn hint_mode(config: &config::Config, total_start: Instant)

Flow

  1. Lock: try_acquire_lock() — exit if already running
  2. X11 init: X11::new() via with_thread_timeout(2s)
  3. App rule: Lookup ApplicationRule by win_info.app_name, fallback to "default"
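
The rule lookup in step 3 amounts to a map lookup with a fallback key. In the sketch below, `ApplicationRule` is trimmed to three fields, `rule_for` is a hypothetical helper name, and exact (case-sensitive) matching of the application name is an assumption.

```rust
use std::collections::HashMap;

/// Trimmed-down rule for illustration; the real struct has more fields.
#[derive(Debug, Clone, PartialEq)]
struct ApplicationRule {
    canny_min_val: i32,
    canny_max_val: i32,
    kernel_size: i32,
}

impl Default for ApplicationRule {
    fn default() -> Self {
        // Documented defaults: 15 / 40 / 3.
        Self { canny_min_val: 15, canny_max_val: 40, kernel_size: 3 }
    }
}

/// Pick the rule for `app_name`, then the "default" entry, then built-ins.
fn rule_for(rules: &HashMap<String, ApplicationRule>, app_name: &str) -> ApplicationRule {
    rules
        .get(app_name)
        .or_else(|| rules.get("default"))
        .cloned()
        .unwrap_or_default()
}
```

An application without its own entry therefore picks up the user's "default" rule before the compiled-in defaults.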

Hunt loop (repeat until done)

A. AT-SPI scan (async threaded)

tokio::time::timeout(150ms, async { AtspiBackend::new(...).get_children().await })
  → rx.recv_timeout(250ms)
  → Vec<Child> or empty

B. Fallback backends (in config order, skip “atspi”)

  • "imageproc": with_thread_timeout(|| imageproc::get_children(), 5s)
  • "ocrs": with_thread_timeout(|| ocrs::get_children(), 15s)

C. Merge fallback text references

  1. Find Text indices in fallback_children
  2. Reclassify Element → Text where overlap > 95%
  3. Discard original backend Text references, keep only BFS

D. Filter tiny children

Remove children < 0.5% of screen (min 3px on each axis)

E. Initial labeling

hints::get_hints(&children, &alphabet, &zones, &padding, Some((w, h)))

F. Overlap culling & relabeling

Threshold: (100 - hint_overlap_threshold) / 100

Pairwise comparison:

  • Text wins over Element
  • Otherwise smaller child is culled

If any culled → re-label survivors with fresh short hints
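
The culling rule can be sketched as follows. The overlap measure is an assumption (intersection area divided by the smaller rectangle's area); the source only gives the threshold formula and the tie-break rules. `Child` is trimmed to the fields the sketch needs, and `cull_overlaps` is a hypothetical name.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ChildKind {
    Element,
    Text,
}

#[derive(Debug, Clone)]
struct Child {
    absolute_position: (f64, f64),
    width: f64,
    height: f64,
    kind: ChildKind,
}

fn area(c: &Child) -> f64 {
    c.width * c.height
}

/// Assumed metric: intersection area relative to the smaller rectangle.
fn overlap_fraction(a: &Child, b: &Child) -> f64 {
    let (ax, ay) = a.absolute_position;
    let (bx, by) = b.absolute_position;
    let w = (ax + a.width).min(bx + b.width) - ax.max(bx);
    let h = (ay + a.height).min(by + b.height) - ay.max(by);
    let smaller = area(a).min(area(b));
    if w <= 0.0 || h <= 0.0 || smaller <= 0.0 {
        0.0
    } else {
        (w * h) / smaller
    }
}

/// Return the indices of the children that survive culling.
fn cull_overlaps(children: &[Child], hint_overlap_threshold: f64) -> Vec<usize> {
    // Threshold formula from the text: (100 - hint_overlap_threshold) / 100.
    let limit = (100.0 - hint_overlap_threshold) / 100.0;
    let mut alive = vec![true; children.len()];
    for i in 0..children.len() {
        for j in (i + 1)..children.len() {
            if !alive[i] {
                break;
            }
            if !alive[j] || overlap_fraction(&children[i], &children[j]) <= limit {
                continue;
            }
            let loser = match (children[i].kind, children[j].kind) {
                (ChildKind::Text, ChildKind::Element) => j, // Text wins
                (ChildKind::Element, ChildKind::Text) => i,
                // Same kind: the smaller child is culled.
                _ => if area(&children[i]) >= area(&children[j]) { j } else { i },
            };
            alive[loser] = false;
        }
    }
    (0..children.len()).filter(|&i| alive[i]).collect()
}
```

With the default of 60 the limit is 0.4, so a word box sitting fully inside a large element replaces that element, while a setting of 0 keeps everything.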

G. Show overlay

overlay::show_overlay(config, &hint_map, &children, x, y, w, h, None)
  → Option<MouseAction>

H. Dispatch action

| action.action | xdotool command |
| --- | --- |
| "click" | mousemove X Y [click BUTTON] × repeat |
| "hover" | mousemove X Y |
| "drag" | mousemove SX SY; sleep; mousedown; steps(8px); sleep; mouseup |
| "select" | mousemove SX SY; sleep; mousedown; mousemove EX EY; sleep; mouseup |

If drag_fullscreen:

  1. Get screen size
  2. Run all non-atspi backends on full screen
  3. Show second overlay with preset_drag_source
  4. Execute drag with interpolation

If hunt_continue: sleep(hunt_timeout_ms), then continue the loop. Otherwise: break.
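
The dispatch step in H can be illustrated by building the xdotool argument list for a click and for a drag with 8px interpolation. The function names, the 0.05s pause, and the choice to repeat the click subcommand once per repetition are assumptions made for this sketch; only the subcommands mousemove, click, mousedown, mouseup, and sleep are taken from xdotool itself.

```rust
/// Pointer positions from (sx, sy) to (ex, ey), roughly `step_px` apart.
/// The end point is always the last element.
fn drag_steps(sx: i32, sy: i32, ex: i32, ey: i32, step_px: f64) -> Vec<(i32, i32)> {
    let (dx, dy) = ((ex - sx) as f64, (ey - sy) as f64);
    let dist = (dx * dx + dy * dy).sqrt();
    let n = (dist / step_px).ceil().max(1.0) as usize;
    (1..=n)
        .map(|i| {
            let t = i as f64 / n as f64;
            (sx + (dx * t).round() as i32, sy + (dy * t).round() as i32)
        })
        .collect()
}

/// Arguments for a "click" action: move once, then click `repeat` times.
fn click_args(x: i32, y: i32, button: u32, repeat: u32) -> Vec<String> {
    let mut args = vec!["mousemove".to_string(), x.to_string(), y.to_string()];
    for _ in 0..repeat {
        args.push("click".to_string());
        args.push(button.to_string());
    }
    args
}

/// Arguments for a "drag" action: press at the source, walk to the
/// destination in small steps, then release.
fn drag_args(sx: i32, sy: i32, ex: i32, ey: i32, button: u32) -> Vec<String> {
    const PAUSE_S: &str = "0.05"; // hypothetical pause, in seconds
    let mut args = vec!["mousemove".to_string(), sx.to_string(), sy.to_string()];
    args.extend(["sleep".to_string(), PAUSE_S.to_string()]);
    args.extend(["mousedown".to_string(), button.to_string()]);
    for (x, y) in drag_steps(sx, sy, ex, ey, 8.0) {
        args.extend(["mousemove".to_string(), x.to_string(), y.to_string()]);
    }
    args.extend(["sleep".to_string(), PAUSE_S.to_string()]);
    args.extend(["mouseup".to_string(), button.to_string()]);
    args
}
```

The resulting vector could be handed to std::process::Command::new("xdotool").args(...), since xdotool accepts several chained subcommands in one invocation.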

src/overlay/

src/overlay/mod.rs

Public types

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ActiveHook {
    Start,
    End,
}

#[derive(Debug, Clone)]
pub struct MouseAction {
    pub action: String,
    pub x: i32,
    pub y: i32,
    pub end_x: i32,
    pub end_y: i32,
    pub button: u32,
    pub repeat: u32,
    pub hunt_continue: bool,
    pub drag_fullscreen: bool,
}

Private state

struct OverlayState {
    config: Config,
    hints: HashMap<String, usize>,
    children: Vec<Child>,
    typed: String,
    mouse_action: Rc<RefCell<Option<MouseAction>>>,
    window_size: (f64, f64),
    window_origin: (i32, i32),
    hunt: bool,
    hunt_exit_next: bool,
    text_selection_mode: bool,
    advanced_mode: bool,
    active_hook: ActiveHook,
    selection_start_child: Option<usize>,
    selection_start_offset_x: f64,
    selection_start_offset_y: f64,
    selection_end_child: Option<usize>,
    selection_end_offset_x: f64,
    selection_end_offset_y: f64,
    consumed_hints: Vec<usize>,
    double_click_mode: bool,
    drag_mode: bool,
    pulse_bright_remaining: u32,
    drag_advanced_mode: bool,
    drag_source_pos: Option<(f64, f64)>,
    drag_source_size: (f64, f64),
    drag_dest_child: Option<usize>,
    drag_source_offset_x: f64,
    drag_source_offset_y: f64,
    drag_dest_offset_x: f64,
    drag_dest_offset_y: f64,
}

Public functions

pub fn show_overlay(
    config: &Config,
    hints: &HashMap<String, usize>,
    children: &[Child],
    x: i32,
    y: i32,
    width: i32,
    height: i32,
    preset_drag_source: Option<(i32, i32)>,
) -> Option<MouseAction>

Creates a transparent GTK3 popup, draws hints, captures keyboard, runs gtk::main(). Returns MouseAction on user input, None if dismissed.

Private functions

fn select_position(
    child: &Child,
    start: bool,
    pad_left: f64,
    pad_right: f64,
) -> (i32, i32)

Computes screen coordinates for selection action:

  • start = true: child.left - pad_left × width, child.top + height / 2
  • start = false: child.right + pad_right × width, child.top + height / 2
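
The two formulas above translate into a few lines. The Child struct here is a hypothetical stand-in with left, top, width, and height fields; the real type may store its geometry differently, and rounding to the nearest pixel is an assumption.

```rust
/// Hypothetical stand-in for the project's Child type.
struct Child {
    left: f64,
    top: f64,
    width: f64,
    height: f64,
}

fn select_position(child: &Child, start: bool, pad_left: f64, pad_right: f64) -> (i32, i32) {
    // Both markers sit on the vertical centre of the child.
    let y = child.top + child.height / 2.0;
    let x = if start {
        // Slightly left of the left edge, padding scaled by the width.
        child.left - pad_left * child.width
    } else {
        // Slightly right of the right edge (left + width).
        child.left + child.width + pad_right * child.width
    };
    (x.round() as i32, y.round() as i32)
}
```

For a child at left 100, top 50, 200 wide and 20 tall with 5% padding on each side, the start marker lands at (90, 60) and the end marker at (310, 60).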

Overlay setup

  1. gtk::init()
  2. Create gtk::Window::new(Popup):
    • app_paintable = true, decorated = false
    • skip_taskbar_hint = true, skip_pager_hint = true
    • accept_focus = false, can_focus = false
    • type_hint = Notification
    • RGBA visual for transparency
  3. Add DrawingArea for cairo rendering
  4. Initialize OverlayState
  5. window.realize() then:
    • set_override_redirect(true)
    • move_resize(x + overlay_x_offset, y + overlay_y_offset, width, height)
  6. Connect signals: draw, key press, button press, show, destroy

Timers

| Timer | Duration | Trigger |
| --- | --- | --- |
| Hunt idle | 10s inactivity | Dismiss overlay |
| Safety | 5s (extended in selection/drag) | Force main_quit |
| Pulse | marker_pulse_interval_ms (16-500ms) | queue_draw if markers visible |

Key event state machine

1. Escape         → dismiss overlay
2. Ctrl (hunt)    → set hunt_exit_next = true
3. text_select_key (/) → toggle text selection mode
4. text_select_key again with start placed → toggle advanced mode
5. double_click_key (Alt) → toggle double-click mode
6. Shift after source placed → fullscreen drag re-scan
7. drag_key (Shift) → toggle drag mode
8. advanced_modifier / per-mode key → toggle advanced/drag_advanced
9. Tab → switch active hook (Start ↔ End)
10. Enter (drag advanced) → confirm drag, dismiss
11. Enter (advanced select) → confirm selection, dismiss
12. Arrow keys → nudge active hook position
13. Unicode char → append to typed, match against hints
    - Exact match → set end marker (selection) / fire action (normal/drag)
    - Prefix match → redraw with matching hints highlighted
    - No match → clear typed
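
Step 13 of the state machine is a three-way match of the typed buffer against the hint map. A minimal sketch, with HintMatch and match_typed as hypothetical names and no handling of letter case:

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum HintMatch {
    /// The typed string is a complete label; carries the child index.
    Exact(usize),
    /// The typed string is a prefix of these labels (sorted).
    Prefix(Vec<String>),
    /// Nothing matches; the caller would clear the typed buffer.
    NoMatch,
}

fn match_typed(hints: &HashMap<String, usize>, typed: &str) -> HintMatch {
    if let Some(&index) = hints.get(typed) {
        return HintMatch::Exact(index);
    }
    let mut candidates: Vec<String> = hints
        .keys()
        .filter(|label| label.starts_with(typed))
        .cloned()
        .collect();
    if candidates.is_empty() {
        HintMatch::NoMatch
    } else {
        candidates.sort();
        HintMatch::Prefix(candidates)
    }
}
```

With labels QA, QS, and WA, typing Q leaves QA and QS highlighted, typing QS fires the action for that child, and typing X clears the buffer.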

src/overlay/drawing.rs

Public functions

pub fn draw_hints(
    cr: &Context,
    config: &Config,
    hints: &HashMap<String, usize>,
    children: &[Child],
    typed: &str,
    consumed_hints: &[usize],
    text_selection_mode: bool,
    selection_start_child: Option<usize>,
    selection_start_offset_x: f64,
    selection_start_offset_y: f64,
    selection_end_child: Option<usize>,
    selection_end_offset_x: f64,
    selection_end_offset_y: f64,
    advanced_mode: bool,
    active_hook: ActiveHook,
    double_click_mode: bool,
    drag_mode: bool,
    drag_advanced_mode: bool,
    drag_source_pos: Option<(f64, f64)>,
    drag_source_size: (f64, f64),
    drag_source_offset_x: f64,
    drag_source_offset_y: f64,
    drag_dest_child: Option<usize>,
    drag_dest_offset_x: f64,
    drag_dest_offset_y: f64,
    window_origin: (i32, i32),
    pulse_bright_remaining: u32,
    marker_bright_duration_ticks: u32,
    drag_marker_square: bool,
    drag_marker_size: f64,
    show_text_boxes: bool,
    show_bfs_boxes: bool,
    text_selection_show_boxes: bool,
    drag_show_boxes: bool,
    window_size: (f64, f64),
)

Drawing order

  1. Clear to transparent
  2. Hint label boxes (shadow → background → border → per-character text)
  3. Spotlight radial gradient holes (around matching hints)
  4. Spotlight selection rectangle (between markers in advanced mode)
  5. Text selection markers (start: red, end: orange)
  6. Drag markers (source: red dot, dest: green dot)
  7. Text selection bounding boxes (blue)
  8. Drag mode bounding boxes (green)
  9. Dev debug: BFS component boxes (red borders)
  10. Dev debug: Text word boxes (blue borders)
  11. Dev debug: Zone grid boundaries

Private functions

fn draw_rounded_rect(
    cr: &Context,
    x: f64,
    y: f64,
    w: f64,
    h: f64,
    r: f64,
)

Draws a rounded rectangle path. r clamped to min(w/2, h/2).

fn overlap_fraction(
    a: (f64, f64, f64, f64),
    b: (f64, f64, f64, f64),
) -> f64

Returns intersection_area / min(area_a, area_b). Used for hint label overlap culling.
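
One way to compute this ratio is sketched below. The signature does not say how the tuples are laid out, so the sketch assumes (x, y, width, height); returning 0.0 for disjoint or zero-area rectangles is also an assumption.

```rust
/// Rectangles are assumed to be (x, y, width, height).
fn overlap_fraction(a: (f64, f64, f64, f64), b: (f64, f64, f64, f64)) -> f64 {
    let (ax, ay, aw, ah) = a;
    let (bx, by, bw, bh) = b;
    // Width and height of the intersection; non-positive means no overlap.
    let ix = (ax + aw).min(bx + bw) - ax.max(bx);
    let iy = (ay + ah).min(by + bh) - ay.max(by);
    if ix <= 0.0 || iy <= 0.0 {
        return 0.0;
    }
    let smaller_area = (aw * ah).min(bw * bh);
    if smaller_area <= 0.0 {
        return 0.0;
    }
    (ix * iy) / smaller_area
}
```

Dividing by the smaller area means a label fully contained in a larger one scores 1.0, which a union-based measure such as IoU would not give.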

src/window_system/

src/window_system/mod.rs

#[derive(Debug, Clone)]
pub struct WindowInfo {
    pub extents: (i32, i32, i32, i32),
    pub pid: u32,
    pub app_name: String,
}

pub trait WindowSystem {
    fn focused_window(&self) -> &WindowInfo;
}

src/window_system/x11.rs

pub struct X11 {
    info: WindowInfo,
}

impl X11 {
    pub fn new() -> Result<Self, Box<dyn std::error::Error>>
}

impl WindowSystem for X11 {
    fn focused_window(&self) -> &WindowInfo
}

X11::new() sequence

  1. RustConnection::connect(None) — open X11 display
  2. Intern atoms: _NET_ACTIVE_WINDOW, _NET_WM_PID, WM_CLASS
  3. get_property(_NET_ACTIVE_WINDOW, root) → active window ID
  4. get_geometry(active_win) → window size
  5. translate_coordinates(active_win, root, 0, 0) → absolute position
  6. get_property(_NET_WM_PID) → PID
  7. get_property(WM_CLASS) → first null-terminated string as app_name
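
Step 7 relies on the WM_CLASS layout: the property value holds two consecutive null-terminated strings, the instance name followed by the class name. Extracting the first one from the raw bytes can be sketched as below; the helper name is hypothetical and the lossy UTF-8 conversion is an assumption.

```rust
/// Returns the first null-terminated string of a raw WM_CLASS value.
fn first_wm_class_string(raw: &[u8]) -> String {
    // split() always yields at least one (possibly empty) slice.
    let first = raw.split(|&byte| byte == 0).next().unwrap_or_default();
    String::from_utf8_lossy(first).into_owned()
}
```

For the value "firefox\0Firefox\0" this yields "firefox", which is the name the app rule lookup would then use.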

pub fn screen_size() -> Result<(i32, i32), Box<dyn std::error::Error>>

Connects to X11, returns (width_in_pixels, height_in_pixels) of root window.

src/backend/

src/backend/mod.rs

pub mod atspi;
pub mod imageproc;
#[cfg(feature = "ocr")]
pub mod ocrs;

src/backend/atspi.rs

Async D-Bus accessibility tree walker.

pub struct AtspiBackend {
    conn: Connection,
    rule: ApplicationRule,
    window_info: WindowInfo,
    scale_factor: f64,
}

impl AtspiBackend {
    pub async fn new(
        window_info: WindowInfo,
        rule: ApplicationRule,
    ) -> Result<Self, Box<dyn std::error::Error>>

    pub async fn get_children(
        &self,
    ) -> Result<Vec<Child>, Box<dyn std::error::Error>>
}

get_children() flow

  1. find_active_window() — iterate D-Bus tree for app → window matching PID + Active state
  2. walk_children(proxy, &mut children, 0) — recursive, max depth 20, max 500/level
    • Batch-fetch children via futures::join_all
    • Query role, state, extents via tokio::join!
    • Filter: ALL of (Sensitive, Showing, Visible) AND NONE of EXCLUDED_ROLES
    • Role → ChildKind::Text if is_text_role(), else Element
    • Recurse via Box::pin(self.walk_children(...))

Private methods

async fn find_active_window(
    &self,
) -> Result<Option<AccessibleProxy<'_>>, zbus::Error>

async fn walk_children(
    &self,
    proxy: &AccessibleProxy<'_>,
    children: &mut Vec<Child>,
    depth: usize,
) -> Result<(), Box<dyn std::error::Error>>

fn is_text_role(role: i32) -> bool

is_text_role() matches

| Role ID | Name |
| --- | --- |
| 36 | Label |
| 74 | Text |
| 87 | DocumentText |
| 116 | Static |
| 73 | Paragraph |
| 83 | Heading |
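
Given that table, the role check reduces to a single pattern match. The sketch below copies the IDs verbatim from the table; the ChildKind enum and the classify helper are simplified stand-ins for illustration.

```rust
#[derive(Debug, PartialEq)]
enum ChildKind {
    Text,
    Element,
}

/// Role IDs taken as-is from the table above.
fn is_text_role(role: i32) -> bool {
    matches!(role, 36 | 74 | 87 | 116 | 73 | 83)
}

/// Hypothetical helper: maps a role to the kind used during the tree walk.
fn classify(role: i32) -> ChildKind {
    if is_text_role(role) {
        ChildKind::Text
    } else {
        ChildKind::Element
    }
}
```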

src/backend/imageproc.rs

Computer vision: screenshot → edge detection → BFS → text lines.

Statics

pub static DEBUG_BFS_COMPONENTS: Mutex<Vec<Child>>
pub static SAVE_DEBUG_IMAGES: AtomicBool

Public functions

pub fn get_children(
    window_info: &WindowInfo,
    rule: &ApplicationRule,
) -> Result<Vec<Child>, Box<dyn std::error::Error>>

Pipeline

  1. Screenshot: X11 GetImage(Z_PIXMAP, root, x, y, w, h) → BGRA buffer
  2. Dual grayscale (single pass over BGRA):
    • luma: 0.299R + 0.587G + 0.114B — text detection
    • process_img: max(R, G, B) — preserves color edge contrast
  3. Canny edge detection on process_img:
    • Resize by detection_scale (Nearest neighbor)
    • imageproc::edges::canny(src, canny_min_val, canny_max_val)
  4. Text word detection (detect_text_words):
    • Resize luma by same scale
    • Horizontal projection → text line bands (threshold 0.5% width, min 8px, max gap 2% height)
    • Vertical projection per line → word segments (gap threshold 25% line-height, min 3px gap, min 4px word)
    • Scale back by inv_scale
  5. Dilation: imageproc::morphology::dilate(edges, LInf, kernel_size/2)
  6. BFS: 4-direction flood fill → connected components as ChildKind::Element
  7. Filter: Remove components >50% of window
  8. Overlap analysis:
    • BFS overlapping text words >95% → ChildKind::Text
    • Words overlapping 2+ BFS → added as separate Text
  9. Debug images: Saved to /tmp/qhints_debug/ if SAVE_DEBUG_IMAGES
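
Two of the pipeline steps, the dual grayscale pass (step 2) and the 4-direction BFS (step 6), can be sketched without any image crate. Function names, the flat Vec<u8> buffers, and the (x, y, w, h) bounding-box tuple are assumptions for illustration; the real code works on image types from the imageproc ecosystem.

```rust
use std::collections::VecDeque;

/// One pass over a BGRA buffer, producing (luma, max-channel) planes.
fn dual_grayscale(bgra: &[u8]) -> (Vec<u8>, Vec<u8>) {
    let mut luma = Vec::with_capacity(bgra.len() / 4);
    let mut max_channel = Vec::with_capacity(bgra.len() / 4);
    for px in bgra.chunks_exact(4) {
        let (b, g, r) = (px[0] as f32, px[1] as f32, px[2] as f32);
        luma.push((0.299 * r + 0.587 * g + 0.114 * b).round() as u8);
        max_channel.push(px[0].max(px[1]).max(px[2]));
    }
    (luma, max_channel)
}

/// Bounding boxes (x, y, w, h) of 4-connected groups of non-zero pixels.
fn connected_components(mask: &[u8], w: usize, h: usize) -> Vec<(usize, usize, usize, usize)> {
    let mut seen = vec![false; w * h];
    let mut boxes = Vec::new();
    for start in 0..w * h {
        if mask[start] == 0 || seen[start] {
            continue;
        }
        seen[start] = true;
        let mut queue = VecDeque::new();
        queue.push_back(start);
        let (mut x0, mut y0, mut x1, mut y1) = (start % w, start / w, start % w, start / w);
        while let Some(i) = queue.pop_front() {
            let (x, y) = (i % w, i / w);
            x0 = x0.min(x);
            x1 = x1.max(x);
            y0 = y0.min(y);
            y1 = y1.max(y);
            // Enqueue an unvisited foreground neighbour.
            let mut visit = |nx: usize, ny: usize| {
                let n = ny * w + nx;
                if mask[n] != 0 && !seen[n] {
                    seen[n] = true;
                    queue.push_back(n);
                }
            };
            if x > 0 { visit(x - 1, y); }
            if x + 1 < w { visit(x + 1, y); }
            if y > 0 { visit(x, y - 1); }
            if y + 1 < h { visit(x, y + 1); }
        }
        boxes.push((x0, y0, x1 - x0 + 1, y1 - y0 + 1));
    }
    boxes
}
```

In the pipeline the mask would be the dilated Canny edge image, and each bounding box becomes one ChildKind::Element candidate before the size and overlap filters run.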

Private functions

fn detect_text_words(
    edges: &GrayImage,
    _luma: &GrayImage,
    img_w: u32,
    img_h: u32,
    win_x: u32,
    win_y: u32,
) -> Vec<Child>

fn draw_boxes(
    luma: &GrayImage,
    words: &[Child],
    all_bfs: &[Child],
    kept: &[Child],
    path: &Path,
) -> Result<(), Box<dyn std::error::Error>>

src/backend/ocrs.rs (feature-gated)

OCR-based detection via ocrs + rten.

Constants

const DETECTION_MODEL: &str = "https://ocrs-models.s3-accelerate.amazonaws.com/text-detection.rten";
const RECOGNITION_MODEL: &str = "https://ocrs-models.s3-accelerate.amazonaws.com/text-recognition.rten";

Public functions

pub fn get_children(
    window_info: &WindowInfo,
    rule: &ApplicationRule,
) -> Result<Vec<Child>, Box<dyn std::error::Error>>

Pipeline

  1. Screenshot: X11 GetImage (same as imageproc)
  2. RGB conversion: BGRA → flat RGB array; also build luma for BFS
  3. Model download: Download models to ~/.cache/qhints/ocrs/
    • Skips if already cached
  4. OCR: ocrs::OcrEngine::new with detection + recognition models
    • engine.prepare_input(ImageSource) → ocr_input
    • engine.detect_words(&ocr_input) → word bounding boxes as ChildKind::Text
  5. BFS gap-filling: Canny + dilation + BFS on luma → ChildKind::Element
  6. Filter: Remove BFS components overlapping OCR words >30%
  7. Merge: BFS first, then word boxes appended

Private functions

fn cache_dir() -> Result<PathBuf, Box<dyn std::error::Error>>

fn download_model(
    url: &str,
    path: &Path,
) -> Result<(), Box<dyn std::error::Error>>