qhints-rs
qhints-rs scans UI elements on your screen, assigns each one a keyboard label (a “hint”), and lets you interact with them by typing — no mouse required.
Type a label → click. Hold Ctrl → hover. Press / → select text. Press Shift → drag.

- X11 only (Wayland not yet supported)
- Rust rewrite of qhints
- ~3200 lines of Rust
- Daily-driven on i3wm with picom
Demo: https://youtu.be/BWC7h5dmkI4
Repo: https://github.com/smllb/qhints-rs
Installation
System requirements
- X11 session (not Wayland)
- AT-SPI D-Bus service — usually ships with your desktop. Verify:
systemctl --user status at-spi-dbus-bus.service - Rust toolchain (install via rustup):
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh - System packages:
sudo apt install xdotool libgtk-3-dev librsvg2-dev - For the OCR backend:
sudo apt install clang libclang-dev - A compositor recommended (tested with picom on i3)
Build
git clone https://github.com/smllb/qhints-rs
cd qhints-rs
cargo build --release
Binary at target/release/qhints-rs.
For OCR support (optional):
cargo build --release --features ocr
Note: The OCR backend downloads ~35MB of models (text-detection.rten, text-recognition.rten) from AWS S3 on first run to ~/.cache/qhints/ocrs/.
Or install directly from git:
cargo install --git https://github.com/smllb/qhints-rs
Keybinding
Bind to a shortcut in your WM config. i3 example:
bindsym ctrl+shift+p exec --no-startup-id /home/you/qhints-rs/target/release/qhints-rs
You can also use the wrapper script at scripts/run-qhints.sh (logs to syslog).
Quick Start
1. Build
git clone https://github.com/smllb/qhints-rs
cd qhints-rs
cargo build --release
2. Run
./target/release/qhints-rs
A transparent overlay appears over the active window. Every detected UI element has a small yellow pill-shaped label on it, like QA, QS, QD.
3. Click something
Type the letters you see on the label. For example, if a terminal window shows QS on the close button, press Q then S. The cursor moves to that button and clicks.
- The first letter of each label is red, the rest is dark gray.
- As you type, matched letters turn green.
- If no hint matches what you typed, the input resets automatically.
- Press Escape to dismiss the overlay at any time.
4. Hover instead of click
Hold Ctrl while typing the last character — the cursor moves there but doesn’t click. Useful for tooltips and previews.
5. Try the other modes
If the overlay is up, press these keys to switch modes:
| Key | Mode | What it does |
|---|---|---|
| (just type) | Normal | Click at element center |
| Ctrl + last key | Hover | Move mouse, no click |
| Alt | Double-click | Type one hint → double-click |
| / | Text select | Pick two elements → select between them |
| Shift | Drag | Pick source → pick destination → drag |
What to try it on
- A file manager — close a window, hover a tooltip
- A text editor — select a word using
/mode - A browser — click buttons, drag a tab
Verbose output
./target/release/qhints-rs -v # debug logs
./target/release/qhints-rs -vv # trace logs (very noisy)
Modes
Normal mode
Type a hint → click at element center.
| Modifier | Effect |
|---|---|
| (none) | Click left button |
| Hold Ctrl on last key | Hover (mousemove, no click) |
Double-click mode
Press Alt (configurable double_click_key) to toggle. The next hint you type
double-clicks that element. Mode auto-resets after one use.
Visual feedback: thicker label border when active.
Text selection mode
Press / (configurable text_select_key) to toggle. Select text between two
elements.
Normal text selection
- Press
/to enter text selection mode (blue borders appear on all elements) - Type a hint to place the start marker (red vertical line)
- Type a second hint → selection is executed immediately:
- Text words select from word-edge to word-edge
- UI elements select from center to center
Advanced text selection
- Press
/to enter text selection mode - Type first hint → start marker placed
- Press
/or Ctrl again to enter advanced mode (hints hide, only markers shown) - Type second hint → end marker placed (orange vertical line)
- Fine-tune with controls:
| Key | Effect |
|---|---|
| Arrow keys | Nudge active marker position |
| Shift + arrow | Larger nudge |
| Tab | Switch between start/end marker |
| Enter | Confirm and execute selection |
Drag mode
Press Shift (configurable drag_key) to toggle. Drag an element to a new position.
Basic drag
- Press Shift to enter drag mode (green borders appear)
- Type first hint → source element selected
- Type second hint → destination picked, drag executes immediately
- Mouse moves from source center to destination center
- Smooth interpolation at 8px steps
Fullscreen drag
After placing the source (step 2), press Shift again to re-scan the entire monitor. This lets you pick a destination anywhere on screen, not just in the current window.
Advanced drag
- Press Shift, place source marker
- Press Ctrl to enter advanced drag mode
- Type second hint → destination marker placed
- Arrow keys to nudge, Tab to switch, Enter to confirm
Hunt mode
With hunt: true in config, the overlay re-appears after every action so you can
perform multiple operations in sequence.
- Hold Ctrl during hunt to signal “this is the last one” — next action exits
- Auto-dismisses after 10 seconds of inactivity
Mode combos
| Sequence | Result |
|---|---|
| Alt → hint | Double-click |
/ → hint → hint | Select text between two elements |
/ → hint → / → hint → (arrows) → Enter | Advanced text select |
| Shift → hint → hint | Drag source to destination |
| Shift → hint → Shift → (pick anywhere) → hint | Fullscreen drag |
Configuration
Config file at ~/.config/qhints/config.json (or $XDG_CONFIG_HOME/qhints/config.json).
All fields are optional. If a field is missing, the Rust default is used.
Quick example
{
"backends": ["imageproc"],
"first_key_zones": [
["q","w","e","r","t","y","u","i","o","p"],
["a","s","d","f","g","h","n","m","l"]
],
"hints": {
"hint_opacity": 0.85,
"hint_font_size": 12
},
"dev": {
"spotlight": true
}
}
General settings
| Field | Type | Default | Description |
|---|---|---|---|
exit_key | integer | 65307 (Escape) | Keycode to dismiss the overlay |
hover_modifier | integer | 4 (Ctrl) | Modifier held on last key for hover |
double_click_key | integer | 65513 (Alt_L) | Toggle double-click mode |
text_select_key | integer | 47 (/) | Toggle text selection mode |
drag_key | integer | 65505 (Shift_L) | Toggle drag mode |
advanced_modifier | integer | 0 | Alternative key for advanced mode (0 = disabled, uses per-mode defaults) |
overlay_x_offset | integer | 0 | Horizontal offset for overlay position |
overlay_y_offset | integer | 0 | Vertical offset for overlay position |
backends | array of strings | ["imageproc"] | Scanning backends in priority order |
hunt | boolean | false | Re-scan after every action |
center_zone_padding | float or object | 0.2 | Fraction of screen excluded from center zone (uniform or {top,right,bottom,left}) |
Hint labels (first_key_zones)
Controls which keys occupy which screen zones. A ragged-row grid where each cell holds the first-character keys for that zone. Example:
[
["q","w","e","r","t","y","u","i","o","p"],
["a","s","d","f","g","h","j","k","l"],
["z","x","c","v","b","n","m"]
]
Rows are equal-height bands. Columns within a row are equal-width. Short rows’ last cell spans to fill the remaining width.
Characters (complementary_keys_alphabet)
{ "complementary_keys_alphabet": "qwertyuiopasdfghjklzxcvbnm" }
Characters used for the 2nd (and 3rd) characters in multi-char hint labels. All first-key zone characters must appear in this alphabet.
Hint appearance
All color fields support separate _r _g _b _a values (0.0–1.0) or hex strings:
{ "hints": { "hint_font": "#2a2a2a" } }
Hex formats: #RGB, #RGBA, #RRGGBB, #RRGGBBAA.
Label box
| Field | Default | Description |
|---|---|---|
hint_height | 20.0 | Label height in pixels |
hint_width_padding | 10.0 | Horizontal padding inside label |
hint_corner_radius | 6.0 | Rounded corner radius |
hint_border_width | 1.0 | Border line width |
hint_opacity | 1.0 | Global alpha multiplier for all visuals |
hint_upercase | true | Display labels in uppercase |
Background
| Field | Default | Description |
|---|---|---|
hint_background_r | 1.0 | Background red |
hint_background_g | 0.95 | Background green |
hint_background_b | 0.55 | Background blue |
hint_background_a | 0.95 | Background alpha |
Border
| Field | Default | Description |
|---|---|---|
hint_border_r | 0.78 | Border red |
hint_border_g | 0.72 | Border green |
hint_border_b | 0.36 | Border blue |
hint_border_a | 1.0 | Border alpha |
Text
| Field | Default | Description |
|---|---|---|
hint_font_size | 14.0 | Font size |
hint_font_face | "monospace" | Font family |
hint_font_r | 0.16 | Character color red |
hint_font_g | 0.16 | Character color green |
hint_font_b | 0.16 | Character color blue |
hint_font_a | 1.0 | Character alpha |
First character
| Field | Default | Description |
|---|---|---|
hint_first_font_r | 0.85 | First-char color red |
hint_first_font_g | 0.1 | First-char color green |
hint_first_font_b | 0.1 | First-char color blue |
hint_first_font_a | 1.0 | First-char alpha |
hint_first_font_size_boost | 0.0 | Extra font size for first char |
Pressed (typed) state
| Field | Default | Description |
|---|---|---|
hint_pressed_font_r | 0.45 | Typed-char color red |
hint_pressed_font_g | 0.75 | Typed-char color green |
hint_pressed_font_b | 0.25 | Typed-char color blue |
hint_pressed_font_a | 1.0 | Typed-char alpha |
Shadow
| Field | Default | Description |
|---|---|---|
hint_shadow | true | Enable drop shadow |
hint_shadow_r | 0.0 | Shadow color red |
hint_shadow_g | 0.0 | Shadow color green |
hint_shadow_b | 0.0 | Shadow color blue |
hint_shadow_a | 0.3 | Shadow opacity |
hint_shadow_offset_x | 1.0 | Shadow x offset |
hint_shadow_offset_y | 1.0 | Shadow y offset |
Text selection mode settings
| Field | Default | Description |
|---|---|---|
text_select_border_r | 0.0 | Border red in text selection mode |
text_select_border_g | 0.6 | Border green |
text_select_border_b | 1.0 | Border blue |
text_select_border_a | 1.0 | Border alpha |
text_select_padding_left | 0.0 | Left selection offset (fraction of element width) |
text_select_padding_right | 0.0 | Right selection offset |
text_select_advanced_key | 0 | Per-mode key to toggle advanced mode |
text_select_nudge_step_x | 0.03 | Arrow nudge step X (fraction of element) |
text_select_nudge_step_y | 0.03 | Arrow nudge step Y |
text_select_nudge_step_shift_x | 0.15 | Shift+arrow nudge step X |
text_select_nudge_step_shift_y | 0.15 | Shift+arrow nudge step Y |
text_select_pulse_period_ms | 1200 | Marker pulse animation period |
marker_pulse_interval_ms | 83 | Pulse redraw rate (~60 fps) |
marker_bright_duration_ticks | 10 | Bright flash duration on marker placement |
text_selection_show_boxes | true | Show blue bounding boxes around all children |
Drag mode settings
| Field | Default | Description |
|---|---|---|
drag_advanced_key | 0 | Per-mode key to toggle advanced drag |
drag_delay_ms | 50 | Delay before mousedown/mouseup |
drag_fullscreen_default | true | Auto-trigger fullscreen re-scan after source pick |
drag_marker_shape | "circle" | Marker shape ("circle" or "square") |
drag_marker_size | 4.0 | Marker radius/half-size |
drag_show_boxes | true | Show green bounding boxes around all children |
Advanced mode
| Field | Default | Description |
|---|---|---|
advanced_border_extra_width | 0.25 | Extra border width in advanced mode |
Overlap culling
| Field | Default | Description |
|---|---|---|
hint_overlap_threshold | 60.0 | 0 = show all, 100 = very aggressive culling |
Dev options
{
"dev": {
"show_grid": false,
"hunt": false,
"hunt_timeout_ms": 300,
"spotlight": false,
"spotlight_opacity": 0.65,
"spotlight_radius": 2.5,
"advanced_spotlight_opacity": 0.4,
"drag_spotlight_opacity": 0.4,
"show_text_boxes": false,
"show_bfs_boxes": false,
"save_debug_images": false
}
}
| Field | Default | Description |
|---|---|---|
show_grid | false | Draw zone grid boundaries |
hunt | false | Re-scan after every action |
hunt_timeout_ms | 300 | Delay before re-scan |
spotlight | false | Dark overlay with holes around matching hints |
spotlight_opacity | 0.65 | Darkness of spotlight overlay |
spotlight_radius | 2.5 | Multiplier for spotlight hole radius |
advanced_spotlight_opacity | 0.4 | Spotlight darkness in advanced selection mode |
drag_spotlight_opacity | 0.4 | Spotlight darkness in drag mode |
show_text_boxes | false | Blue bounding boxes around text words |
show_bfs_boxes | false | Red bounding boxes around BFS components |
save_debug_images | false | Save intermediate debug PNGs to /tmp/qhints_debug/ |
Application rules
Per-application overrides keyed by the application’s WM_CLASS name.
{
"application_rules": {
"firefox": {
"canny_min_val": 20,
"canny_max_val": 50,
"kernel_size": 3
},
"default": {
"canny_min_val": 15,
"canny_max_val": 40,
"kernel_size": 3
}
}
}
| Field | Default | Description |
|---|---|---|
scale_factor | 1.0 | HiDPI coordinate scale |
detection_scale | 1.0 | Screenshot upscale before CV (1–4) |
states | [24, 25, 30] | AT-SPI state filter (Sensitive, Showing, Visible) |
states_match_type | 1 (ALL) | Match type: 1=all, 3=none |
roles | excluded roles | AT-SPI role filter |
roles_match_type | 3 (NONE) | Match type: 1=all, 3=none |
canny_min_val | 15 | Canny low threshold |
canny_max_val | 40 | Canny high threshold |
kernel_size | 3 | Dilation kernel |
center_zone_padding | 0.2 | Per-rule center zone padding |
Architecture
Data flow
main.rs ──→ backends (scan window → Vec<Child>)
──→ hints.rs (assign labels → HashMap<label, index>)
──→ overlay GTK window (capture keyboard, draw hints, return action)
──→ main.rs (execute xdotool command)
1. Scan
The focused window’s UI elements are detected through multiple backends:
- AT-SPI runs first (async D-Bus tree walk, 250ms hard deadline)
- Fallback backends run in configured order (imageproc, OCR)
- All results merge into a single
Vec<Child>
2. Filter
- Children smaller than 0.5% of screen dimensions are removed
- Pairwise overlap culling removes duplicates, preferring
TextoverElement - Survivors are re-labeled with fresh short hints
3. Label
hints::get_hints() assigns keyboard labels using a spatial zone grid:
- Screen divided into zones based on
first_key_zonesconfig - Children bucketed into their zone by position
- Overflow redistributes to neighbors, then globally
- Each zone gets single-char or multi-char labels
4. Show
A transparent GTK3 popup window is positioned over the target window. Cairo renders hint labels, markers, and overlays. Keyboard is grabbed.
5. Input
Key events feed through a state machine that tracks:
typedprefix buffer- Current mode (normal, text selection, drag, double-click)
- Advanced mode sub-state
- Marker positions and offsets
6. Act
A MouseAction is returned to main.rs and dispatched via xdotool:
- click:
xdotool mousemove X Y click 1 - hover:
xdotool mousemove X Y - drag: interpolated mousemove from source to destination
- select: mousedown at start, mousemove to end, mouseup
Key data structures
Child {
relative_position: (f64, f64), // relative to window top-left
absolute_position: (f64, f64), // screen coordinates
width: f64,
height: f64,
kind: ChildKind // Element | Text
}
MouseAction {
action: String, // "click" | "hover" | "drag" | "select"
x, y: i32, // primary coordinates
end_x, end_y: i32, // secondary coordinates (for drag/select)
button: u32, // mouse button
repeat: u32, // click count
hunt_continue: bool, // continue hunt loop?
drag_fullscreen: bool // trigger fullscreen re-scan?
}
ChildKind system
ChildKind::Text— word-level content from OCR or imageproc text detection- Gets blue border in text selection mode
- Selection snaps to word edges
ChildKind::Element— UI components (buttons, icons, BFS components)- Normal border always
- Selection uses element center (left-to-right)
Hint generation algorithm
first_key_zones (ragged grid: e.g. 10/9/7 columns)
│
map child (rx, ry) → (row, col) zone
│
per-zone capacity = len(zone_keys) × alphabet_len
│
overflow? → redistribute to neighbors (spatial)
still overflow? → global redistribution
│
within capacity? → single-char keys
within 2-char? → first_key + alphabet_char
else → first_key + r1 + r2 (3-char)
Center-zone children get priority for shorter (single-char) labels. Periphery zones prefer 2-char to reserve short labels for center elements.
Overlap culling (two-pass)
Pass 1 (main.rs): Pairwise overlap on raw children before labeling
- When overlap exceeds threshold:
Textwins overElement, otherwise the larger child survives
Pass 2 (drawing.rs): On rendered hint label rectangles
- Deterministic: keep the first (top-left) visible hint
- Uses
hint_overlap_thresholdconfig (default 60%)
Threading
| Operation | Thread | Timeout |
|---|---|---|
| X11 init | Spawned thread | 2s |
| AT-SPI tree walk | Spawned thread with tokio | 250ms |
| imageproc scan | Spawned thread | 5s |
| OCR scan | Spawned thread | 15s |
| GTK main loop | Main thread | — |
| Pulse animation | GLib timer | marker_pulse_interval_ms |
Note: Spawned threads are NOT cancelled on timeout (Rust limitation). Internal backends have their own timeouts, and the lock file prevents >2-3 orphaned threads.
Lock file
A lock file at /tmp/qhints.lock prevents multiple concurrent instances.
Uses flock(LOCK_EX | LOCK_NB).
Backends
Backends are configured via the backends config field. All configured backends
run in order and their results are merged.
Default: ["imageproc"]
{ "backends": ["atspi", "imageproc", "ocrs"] }
Merge pipeline
AT-SPI ──────┐
imageproc ───┤── merge ──→ filter tiny ──→ overlap cull ──→ label
ocrs ────────┘
When multiple backends produce overlapping children:
- Text references from fallback backends reclassify BFS components (
Element→Text) when overlap exceeds 95% - Original backend Text references are discarded — only BFS components survive
- Pairwise overlap culling prefers
TextoverElement
AT-SPI (async D-Bus)
Connects to the system’s accessibility bus via atspi + zbus. Walks the
accessibility tree of the focused window.
- Queries roles, states, and geometry via batched async calls
- Max recursion depth: 20
- Max children per level: 500
- Filters by application state (Sensitive + Showing + Visible)
- Excludes roles: PANEL, FRAME, MENU_BAR, TOOL_BAR, LIST, SCROLL_PANE, TABLE, etc.
- Timeout: 150ms tokio timeout + 250ms hard deadline
Text role detection
The following AT-SPI roles produce ChildKind::Text:
Label(36), Text(74), DocumentText(87), Static(116), Paragraph(73), Heading(83)
Everything else becomes ChildKind::Element.
Requirements
systemctl --user status at-spi-dbus-bus.service
Some applications (VS Code, browsers) may not expose accessibility info unless launched with appropriate flags.
Imageproc (computer vision)
Screenshot-based detection. Runs when "imageproc" is in the backends list.
Pipeline
- Screenshot — X11
GetImageon the window region - Grayscale conversion — dual pass: max-of-RGB (for edges) and weighted luminance (for text detection)
- Canny edge detection — configurable min/max thresholds and detection scale
- Text word detection — horizontal projection → text line bands → vertical projection per line → word segments
- Dilation — morphological dilation to connect nearby edges
- BFS — 4-direction flood fill → connected components
- Filter — remove components >50% of window
- Overlap analysis — BFS components overlapping text words by >95% are
reclassified as
Text
Output
- BFS connected components →
ChildKind::Element - Text line/word segments →
ChildKind::Text - Debug images saved to
/tmp/qhints_debug/whendev.save_debug_imagesis enabled
Per-app tuning
{
"application_rules": {
"firefox": {
"canny_min_val": 20,
"canny_max_val": 50,
"kernel_size": 3,
"detection_scale": 1.0
}
}
}
Higher canny_min_val = fewer edges (sparser detection).
Higher detection_scale = upscale before detection (more detail, slower).
Timeout: 5 seconds.
OCR (optional)
Feature-gated. Build with --features ocr to enable.
Uses ocrs + rten for text detection and recognition. Downloads pre-trained
models (~35MB) from AWS S3 on first run to ~/.cache/qhints/ocrs/.
Note: Requires clang + libclang-dev system packages.
Pipeline
- Screenshot — same X11
GetImageas imageproc - OCR —
ocrs::OcrEnginedetects word bounding boxes - Word boxes →
ChildKind::Text - BFS gap-filling — Canny edge detection + dilation + BFS on the same screenshot to find non-text elements
- Filter — BFS components overlapping OCR words by >30% are removed
- Merge — BFS components appended first, then word boxes
Models
text-detection.rten— finds text regionstext-recognition.rten— recognizes characters (not used for positioning)
Timeout: 15 seconds.
Troubleshooting
Overlay doesn’t appear
Check you’re on X11:
echo $XDG_SESSION_TYPE # should say "x11"
Verify AT-SPI is running:
systemctl --user status at-spi-dbus-bus.service
Run from a terminal with -v to see logs:
./qhints-rs -v
Keyboard input not reaching overlay
The keyboard grab can fail with some window managers. Click on the overlay
to dismiss it (acts as safety net). Use -vv for trace-level logs.
No elements detected
Some applications (VS Code, browsers) don’t expose accessibility info.
Try adding imageproc to your backends:
{ "backends": ["imageproc"] }
If that works but AT-SPI doesn’t, your app may need --force-renderer-accessibility
or similar flags.
“qhints already running”
The lock file at /tmp/qhints.lock prevents concurrent instances. If qhints
crashed, remove it:
rm /tmp/qhints.lock
Overlay on wrong monitor
Use overlay_x_offset and overlay_y_offset to shift. With i3 + multi-monitor,
the overlay may need adjustments:
{
"overlay_x_offset": 1920,
"overlay_y_offset": 0
}
Can’t find config file
Config path: ~/.config/qhints/config.json. Create it if missing — all
fields are optional.
Hints too large / too small
Adjust hint_font_size and hint_height:
{
"hints": {
"hint_font_size": 12,
"hint_height": 18
}
}
Selection/drag precision issues
Decrease nudge step values for finer control:
{
"hints": {
"text_select_nudge_step_x": 0.01,
"text_select_nudge_step_y": 0.01
}
}
Build fails
Ensure system deps are installed:
sudo apt install xdotool libgtk-3-dev librsvg2-dev clang libclang-dev
For the OCR feature (enabled by default), you need clang + libclang-dev.
Disable OCR if you don’t need it:
cargo build --release --no-default-features
GPU/performance
The overlay uses software rendering via Cairo. On slower systems, increase
marker_pulse_interval_ms to reduce redraw frequency:
{
"hints": {
"marker_pulse_interval_ms": 50
}
}
Wayland
qhints-rs depends on:
x11rb— window geometry and focus trackingxdotool— mouse input simulation- GTK3 overlay (may work under XWayland, untested)
Wayland support would require:
- A Wayland client library for window info
ext-image-capture-srcfor screenshotslibei/ydotoolfor input
Contributions welcome.
src/child.rs
#![allow(unused)]
fn main() {
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ChildKind {
Element,
Text,
}
}
#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct Child {
pub relative_position: (f64, f64),
pub absolute_position: (f64, f64),
pub width: f64,
pub height: f64,
pub kind: ChildKind,
}
}
src/config.rs
Constants
#![allow(unused)]
fn main() {
pub const ATSPI_STATE_SENSITIVE: i32 = 24;
pub const ATSPI_STATE_SHOWING: i32 = 25;
pub const ATSPI_STATE_VISIBLE: i32 = 30;
pub const ATSPI_MATCH_ALL: i32 = 1;
pub const ATSPI_MATCH_NONE: i32 = 3;
pub const EXCLUDED_ROLES: &[i32] = &[
39, 85, 25, 23, 34, 63, 31, 38, 121, 49, 55, 99, 116, 83, 73, 123, 110, 20, 122
];
}
Private helpers
#![allow(unused)]
fn main() {
fn config_path() -> PathBuf
fn hex_to_rgba(hex: &str) -> Option<(f64, f64, f64, f64)>
}
Structs
#![allow(unused)]
fn main() {
#[derive(Debug, Clone, Deserialize)]
pub struct HintStyle {
pub hint_height: f64,
pub hint_width_padding: f64,
pub hint_font_size: f64,
pub hint_font_face: String,
pub hint_font_r: f64,
pub hint_font_g: f64,
pub hint_font_b: f64,
pub hint_font_a: f64,
pub hint_first_font_r: f64,
pub hint_first_font_g: f64,
pub hint_first_font_b: f64,
pub hint_first_font_a: f64,
pub hint_first_font_size_boost: f64,
pub hint_overlap_threshold: f64,
pub hint_pressed_font_r: f64,
pub hint_pressed_font_g: f64,
pub hint_pressed_font_b: f64,
pub hint_pressed_font_a: f64,
pub hint_upercase: bool,
pub hint_background_r: f64,
pub hint_background_g: f64,
pub hint_background_b: f64,
pub hint_background_a: f64,
pub hint_border_r: f64,
pub hint_border_g: f64,
pub hint_border_b: f64,
pub hint_border_a: f64,
pub hint_border_width: f64,
pub hint_corner_radius: f64,
pub text_select_border_r: f64,
pub text_select_border_g: f64,
pub text_select_border_b: f64,
pub text_select_border_a: f64,
pub text_select_padding_left: f64,
pub text_select_padding_right: f64,
pub text_select_advanced_key: u32,
pub drag_advanced_key: u32,
pub text_select_nudge_step_x: f64,
pub text_select_nudge_step_y: f64,
pub text_select_nudge_step_shift_x: f64,
pub text_select_nudge_step_shift_y: f64,
pub drag_fullscreen_default: bool,
pub drag_delay_ms: u64,
pub text_select_pulse_period_ms: u64,
pub marker_pulse_interval_ms: u64,
pub marker_bright_duration_ticks: u32,
pub advanced_border_extra_width: f64,
pub drag_marker_shape: String,
pub drag_marker_size: f64,
pub hint_shadow: bool,
pub hint_shadow_r: f64,
pub hint_shadow_g: f64,
pub hint_shadow_b: f64,
pub hint_shadow_a: f64,
pub hint_shadow_offset_x: f64,
pub hint_shadow_offset_y: f64,
pub text_selection_show_boxes: bool,
pub drag_show_boxes: bool,
pub hint_opacity: f64,
}
impl Default for HintStyle
}
#![allow(unused)]
fn main() {
#[derive(Debug, Clone, Copy)]
pub struct ZonePadding {
pub top: f64,
pub right: f64,
pub bottom: f64,
pub left: f64,
}
impl ZonePadding {
pub fn uniform(pad: f64) -> Self
}
}
#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct DevOptions {
pub show_grid: bool,
pub hunt: bool,
pub hunt_timeout_ms: u32,
pub spotlight: bool,
pub spotlight_opacity: f64,
pub spotlight_radius: f64,
pub advanced_spotlight_opacity: f64,
pub drag_spotlight_opacity: f64,
pub show_text_boxes: bool,
pub show_bfs_boxes: bool,
pub save_debug_images: bool,
}
impl Default for DevOptions
}
#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct ApplicationRule {
pub scale_factor: f64,
pub detection_scale: f64,
pub states: Vec<i32>,
pub states_match_type: i32,
pub roles: Vec<i32>,
pub roles_match_type: i32,
pub canny_min_val: i32,
pub canny_max_val: i32,
pub kernel_size: i32,
pub center_zone_padding: ZonePadding,
}
impl Default for ApplicationRule
}
#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct Config {
pub hints: HintStyle,
pub complementary_keys_alphabet: String,
pub exit_key: u32,
pub hover_modifier: u32,
pub double_click_key: u32,
pub advanced_modifier: u32,
pub drag_key: u32,
pub text_select_key: u32,
pub overlay_x_offset: i32,
pub overlay_y_offset: i32,
pub application_rules: HashMap<String, ApplicationRule>,
pub backends: Vec<String>,
pub first_key_zones: Vec<Vec<String>>,
pub center_zone_padding: ZonePadding,
pub dev: DevOptions,
}
impl Default for Config
}
Public functions
#![allow(unused)]
fn main() {
pub fn load_config() -> Config
}
Loads config from $XDG_CONFIG_HOME/qhints/config.json, merging over
Config::default(). If the file doesn’t exist, returns defaults.
Private functions
#![allow(unused)]
fn main() {
fn merge_user_config(config: &mut Config, json: &serde_json::Value)
}
Recursive merge of user JSON into Config. Handles all top-level fields,
hints.*, dev.*, backends, application_rules, first_key_zones,
and center_zone_padding.
src/hints.rs
Public functions
#![allow(unused)]
fn main() {
pub fn get_hints(
children: &[Child],
complementary_keys_alphabet: &str,
first_key_zones: &[Vec<String>],
center_zone_padding: &config::ZonePadding,
window_size: Option<(f64, f64)>,
) -> HashMap<String, usize>
}
Returns a map of hint label string → child index.
Private functions
#![allow(unused)]
fn main() {
fn get_zone(
rx: f64,
ry: f64,
width: f64,
height: f64,
rows: usize,
col_counts: &[usize],
) -> (usize, usize)
}
Normalizes (rx, ry) by (width, height) and maps to ragged grid (row, col).
#![allow(unused)]
fn main() {
fn neighbors(
r: usize,
c: usize,
rows: usize,
col_counts: &[usize],
) -> Vec<(usize, usize)>
}
Returns all 8-directional adjacent zones that exist (respecting ragged rows),
sorted by Euclidean distance from (r, c).
#![allow(unused)]
fn main() {
fn generate_product(
chars: &[char],
repeat: usize,
out: &mut Vec<String>,
)
}
Recursive cartesian product. Generates all possible Strings of length repeat
from chars. Used as fallback when no window_size is available.
src/mouse.rs
Note: Unused. All mouse dispatch is inlined in main.rs.
#![allow(unused)]
fn main() {
pub fn click(
x: i32,
y: i32,
button: u32,
repeat: u32,
) -> Result<(), Box<dyn std::error::Error>>
}
Shells out to xdotool mousemove X Y click BUTTON, repeated repeat times.
src/main.rs
Module declarations
#![allow(unused)]
fn main() {
mod backend;
mod child;
mod config;
mod hints;
mod mouse;
mod overlay;
mod window_system;
}
CLI
#![allow(unused)]
fn main() {
#[derive(Parser)]
#[command(name = "qhints-rs", about = "Keyboard-driven UI navigation for Linux")]
struct Cli {
#[arg(short, long, default_value = "hint")]
mode: String,
#[arg(short, long, action = clap::ArgAction::Count)]
verbose: u8,
}
}
Public functions
fn main()
- Parse CLI args
- Set log level: 0=Info, 1=Debug, 2+=Trace
- Load config
- Dispatch:
"hint"→hint_mode(),"scroll"→ warn not implemented
Private functions
#![allow(unused)]
fn main() {
fn try_acquire_lock() -> Option<std::fs::File>
}
Creates /tmp/qhints.lock and tries flock(LOCK_EX | LOCK_NB). Returns Some(file)
on success, None if another instance holds the lock.
#![allow(unused)]
fn main() {
fn with_thread_timeout<T: Send + 'static>(
f: impl FnOnce() -> T + Send + 'static,
timeout: std::time::Duration,
label: &'static str,
) -> Option<T>
}
Spawns a thread for f(), waits up to timeout for a result via mpsc::channel.
Returns Some(T) on success, None on timeout or thread panic.
Thread is not cancelled on timeout. Backends have internal timeouts and the lock file prevents >2-3 orphaned threads.
#![allow(unused)]
fn main() {
fn hint_mode(config: &config::Config, total_start: Instant)
}
Flow
- Lock:
try_acquire_lock()— exit if already running - X11 init:
X11::new()viawith_thread_timeout(2s) - App rule: Lookup
ApplicationRulebywin_info.app_name, fallback to"default"
Hunt loop (repeat until done)
A. AT-SPI scan (async threaded)
tokio::time::timeout(150ms, async { AtspiBackend::new(...).get_children().await })
→ rx.recv_timeout(250ms)
→ Vec<Child> or empty
B. Fallback backends (in config order, skip “atspi”)
"imageproc":with_thread_timeout(|| imageproc::get_children(), 5s)"ocrs":with_thread_timeout(|| ocrs::get_children(), 15s)
C. Merge fallback text references
- Find
Textindices infallback_children - Reclassify
Element→Textwhere overlap > 95% - Discard original backend
Textreferences, keep only BFS
D. Filter tiny children
Remove children < 0.5% of screen (min 3px on each axis)
E. Initial labeling
hints::get_hints(&children, &alphabet, &zones, &padding, Some((w, h)))
F. Overlap culling & relabeling
Threshold: (100 - hint_overlap_threshold) / 100
Pairwise comparison:
Textwins overElement- Otherwise smaller child is culled
If any culled → re-label survivors with fresh short hints
G. Show overlay
overlay::show_overlay(config, &hint_map, &children, x, y, w, h, None)
→ Option<MouseAction>
H. Dispatch action
action.action | xdotool command |
|---|---|
"click" | mousemove X Y [click BUTTON] × repeat |
"hover" | mousemove X Y |
"drag" | mousemove SX SY; sleep; mousedown; steps(8px); sleep; mouseup |
"select" | mousemove SX SY; sleep; mousedown; mousemove EX EY; sleep; mouseup |
If drag_fullscreen:
- Get screen size
- Run all non-atspi backends on full screen
- Show second overlay with
preset_drag_source - Execute drag with interpolation
If hunt_continue: sleep(hunt_timeout_ms), continue loop
Otherwise: break
src/overlay/
src/overlay/mod.rs
Public types
#![allow(unused)]
fn main() {
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ActiveHook {
Start,
End,
}
}
#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct MouseAction {
pub action: String,
pub x: i32,
pub y: i32,
pub end_x: i32,
pub end_y: i32,
pub button: u32,
pub repeat: u32,
pub hunt_continue: bool,
pub drag_fullscreen: bool,
}
}
Private state
#![allow(unused)]
fn main() {
struct OverlayState {
config: Config,
hints: HashMap<String, usize>,
children: Vec<Child>,
typed: String,
mouse_action: Rc<RefCell<Option<MouseAction>>>,
window_size: (f64, f64),
window_origin: (i32, i32),
hunt: bool,
hunt_exit_next: bool,
text_selection_mode: bool,
advanced_mode: bool,
active_hook: ActiveHook,
selection_start_child: Option<usize>,
selection_start_offset_x: f64,
selection_start_offset_y: f64,
selection_end_child: Option<usize>,
selection_end_offset_x: f64,
selection_end_offset_y: f64,
consumed_hints: Vec<usize>,
double_click_mode: bool,
drag_mode: bool,
pulse_bright_remaining: u32,
drag_advanced_mode: bool,
drag_source_pos: Option<(f64, f64)>,
drag_source_size: (f64, f64),
drag_dest_child: Option<usize>,
drag_source_offset_x: f64,
drag_source_offset_y: f64,
drag_dest_offset_x: f64,
drag_dest_offset_y: f64,
}
}
Public functions
#![allow(unused)]
fn main() {
pub fn show_overlay(
config: &Config,
hints: &HashMap<String, usize>,
children: &[Child],
x: i32,
y: i32,
width: i32,
height: i32,
preset_drag_source: Option<(i32, i32)>,
) -> Option<MouseAction>
}
Creates a transparent GTK3 popup, draws hints, captures keyboard, runs gtk::main().
Returns MouseAction on user input, None if dismissed.
Private functions
#![allow(unused)]
fn main() {
fn select_position(
child: &Child,
start: bool,
pad_left: f64,
pad_right: f64,
) -> (i32, i32)
}
Computes screen coordinates for selection action:
start = true:child.left - pad_left × width,child.top + height / 2start = false:child.right + pad_right × width,child.top + height / 2
Overlay setup
gtk::init()- Create
gtk::Window::new(Popup):app_paintable = true,decorated = falseskip_taskbar_hint = true,skip_pager_hint = trueaccept_focus = false,can_focus = falsetype_hint = Notification- RGBA visual for transparency
- Add
DrawingAreafor cairo rendering - Initialize
OverlayState window.realize()then:set_override_redirect(true)move_resize(x + overlay_x_offset, y + overlay_y_offset, width, height)
- Connect signals: draw, key press, button press, show, destroy
Timers
| Timer | Duration | Trigger |
|---|---|---|
| Hunt idle | 10s inactivity | Dismiss overlay |
| Safety | 5s (extended in selection/drag) | Force main_quit |
| Pulse | marker_pulse_interval_ms (16-500ms) | queue_draw if markers visible |
Key event state machine
1. Escape → dismiss overlay
2. Ctrl (hunt) → set hunt_exit_next = true
3. text_select_key (/) → toggle text selection mode
4. text_select_key again with start placed → toggle advanced mode
5. double_click_key (Alt) → toggle double-click mode
6. Shift after source placed → fullscreen drag re-scan
7. drag_key (Shift) → toggle drag mode
8. advanced_modifier / per-mode key → toggle advanced/drag_advanced
9. Tab → switch active hook (Start ↔ End)
10. Enter (drag advanced) → confirm drag, dismiss
11. Enter (advanced select) → confirm selection, dismiss
12. Arrow keys → nudge active hook position
13. Unicode char → append to typed, match against hints
- Exact match → set end marker (selection) / fire action (normal/drag)
- Prefix match → redraw with matching hints highlighted
- No match → clear typed
src/overlay/drawing.rs
Public functions
#![allow(unused)]
fn main() {
pub fn draw_hints(
cr: &Context,
config: &Config,
hints: &HashMap<String, usize>,
children: &[Child],
typed: &str,
consumed_hints: &[usize],
text_selection_mode: bool,
selection_start_child: Option<usize>,
selection_start_offset_x: f64,
selection_start_offset_y: f64,
selection_end_child: Option<usize>,
selection_end_offset_x: f64,
selection_end_offset_y: f64,
advanced_mode: bool,
active_hook: ActiveHook,
double_click_mode: bool,
drag_mode: bool,
drag_advanced_mode: bool,
drag_source_pos: Option<(f64, f64)>,
drag_source_size: (f64, f64),
drag_source_offset_x: f64,
drag_source_offset_y: f64,
drag_dest_child: Option<usize>,
drag_dest_offset_x: f64,
drag_dest_offset_y: f64,
window_origin: (i32, i32),
pulse_bright_remaining: u32,
marker_bright_duration_ticks: u32,
drag_marker_square: bool,
drag_marker_size: f64,
show_text_boxes: bool,
show_bfs_boxes: bool,
text_selection_show_boxes: bool,
drag_show_boxes: bool,
window_size: (f64, f64),
)
}
Drawing order
- Clear to transparent
- Hint label boxes (shadow → background → border → per-character text)
- Spotlight radial gradient holes (around matching hints)
- Spotlight selection rectangle (between markers in advanced mode)
- Text selection markers (start: red, end: orange)
- Drag markers (source: red dot, dest: green dot)
- Text selection bounding boxes (blue)
- Drag mode bounding boxes (green)
- Dev debug: BFS component boxes (red borders)
- Dev debug: Text word boxes (blue borders)
- Dev debug: Zone grid boundaries
Private functions
#![allow(unused)]
fn main() {
fn draw_rounded_rect(
cr: &Context,
x: f64,
y: f64,
w: f64,
h: f64,
r: f64,
)
}
Draws a rounded rectangle path. r clamped to min(w/2, h/2).
#![allow(unused)]
fn main() {
fn overlap_fraction(
a: (f64, f64, f64, f64),
b: (f64, f64, f64, f64),
) -> f64
}
Returns intersection_area / min(area_a, area_b). Used for hint label overlap culling.
src/window_system/
src/window_system/mod.rs
#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct WindowInfo {
pub extents: (i32, i32, i32, i32),
pub pid: u32,
pub app_name: String,
}
}
#![allow(unused)]
fn main() {
pub trait WindowSystem {
fn focused_window(&self) -> &WindowInfo;
}
}
src/window_system/x11.rs
#![allow(unused)]
fn main() {
pub struct X11 {
info: WindowInfo,
}
impl X11 {
pub fn new() -> Result<Self, Box<dyn std::error::Error>>
}
impl WindowSystem for X11 {
fn focused_window(&self) -> &WindowInfo
}
}
X11::new() sequence
RustConnection::connect(None)— open X11 display- Intern atoms:
_NET_ACTIVE_WINDOW,_NET_WM_PID,WM_CLASS get_property(_NET_ACTIVE_WINDOW, root)→ active window IDget_geometry(active_win)→ window sizetranslate_coordinates(active_win, root, 0, 0)→ absolute positionget_property(_NET_WM_PID)→ PIDget_property(WM_CLASS)→ first null-terminated string as app_name
#![allow(unused)]
fn main() {
pub fn screen_size() -> Result<(i32, i32), Box<dyn std::error::Error>>
}
Connects to X11, returns (width_in_pixels, height_in_pixels) of root window.
src/backend/
src/backend/mod.rs
#![allow(unused)]
fn main() {
pub mod atspi;
pub mod imageproc;
#[cfg(feature = "ocr")]
pub mod ocrs;
}
src/backend/atspi.rs
Async D-Bus accessibility tree walker.
#![allow(unused)]
fn main() {
pub struct AtspiBackend {
conn: Connection,
rule: ApplicationRule,
window_info: WindowInfo,
scale_factor: f64,
}
impl AtspiBackend {
pub async fn new(
window_info: WindowInfo,
rule: ApplicationRule,
) -> Result<Self, Box<dyn std::error::Error>>
pub async fn get_children(
&self,
) -> Result<Vec<Child>, Box<dyn std::error::Error>>
}
}
get_children() flow
find_active_window()— iterate D-Bus tree for app → window matching PID + Active statewalk_children(proxy, &mut children, 0)— recursive, max depth 20, max 500/level- Batch-fetch children via
futures::join_all - Query role, state, extents via
tokio::join! - Filter: ALL of (Sensitive, Showing, Visible) AND NONE of EXCLUDED_ROLES
- Role →
ChildKind::Textifis_text_role(), elseElement - Recurse via
Box::pin(self.walk_children(...))
- Batch-fetch children via
Private methods
#![allow(unused)]
fn main() {
async fn find_active_window(
&self,
) -> Result<Option<AccessibleProxy<'_>>, zbus::Error>
async fn walk_children(
&self,
proxy: &AccessibleProxy<'_>,
children: &mut Vec<Child>,
depth: usize,
) -> Result<(), Box<dyn std::error::Error>>
fn is_text_role(role: i32) -> bool
}
is_text_role() matches
| Role ID | Name |
|---|---|
| 36 | Label |
| 74 | Text |
| 87 | DocumentText |
| 116 | Static |
| 73 | Paragraph |
| 83 | Heading |
src/backend/imageproc.rs
Computer vision: screenshot → edge detection → BFS → text lines.
Statics
#![allow(unused)]
fn main() {
pub static DEBUG_BFS_COMPONENTS: Mutex<Vec<Child>>
pub static SAVE_DEBUG_IMAGES: AtomicBool
}
Public functions
#![allow(unused)]
fn main() {
pub fn get_children(
window_info: &WindowInfo,
rule: &ApplicationRule,
) -> Result<Vec<Child>, Box<dyn std::error::Error>>
}
Pipeline
- Screenshot: X11
GetImage(Z_PIXMAP, root, x, y, w, h)→ BGRA buffer - Dual grayscale (single pass over BGRA):
luma:0.299R + 0.587G + 0.114B— text detectionprocess_img:max(R, G, B)— preserves color edge contrast
- Canny edge detection on
process_img:- Resize by
detection_scale(Nearest neighbor) imageproc::edges::canny(src, canny_min_val, canny_max_val)
- Resize by
- Text word detection (
detect_text_words):- Resize
lumaby same scale - Horizontal projection → text line bands (threshold 0.5% width, min 8px, max gap 2% height)
- Vertical projection per line → word segments (gap threshold 25% line-height, min 3px gap, min 4px word)
- Scale back by
inv_scale
- Resize
- Dilation:
imageproc::morphology::dilate(edges, LInf, kernel_size/2) - BFS: 4-direction flood fill → connected components as
ChildKind::Element - Filter: Remove components >50% of window
- Overlap analysis:
- BFS overlapping text words >95% →
ChildKind::Text - Words overlapping 2+ BFS → added as separate
Text
- BFS overlapping text words >95% →
- Debug images: Saved to
/tmp/qhints_debug/ifSAVE_DEBUG_IMAGES
Private functions
#![allow(unused)]
fn main() {
fn detect_text_words(
edges: &GrayImage,
_luma: &GrayImage,
img_w: u32,
img_h: u32,
win_x: u32,
win_y: u32,
) -> Vec<Child>
fn draw_boxes(
luma: &GrayImage,
words: &[Child],
all_bfs: &[Child],
kept: &[Child],
path: &Path,
) -> Result<(), Box<dyn std::error::Error>>
}
src/backend/ocrs.rs (feature-gated)
OCR-based detection via ocrs + rten.
Constants
#![allow(unused)]
fn main() {
const DETECTION_MODEL: &str = "https://ocrs-models.s3-accelerate.amazonaws.com/text-detection.rten";
const RECOGNITION_MODEL: &str = "https://ocrs-models.s3-accelerate.amazonaws.com/text-recognition.rten";
}
Public functions
#![allow(unused)]
fn main() {
pub fn get_children(
window_info: &WindowInfo,
rule: &ApplicationRule,
) -> Result<Vec<Child>, Box<dyn std::error::Error>>
}
Pipeline
- Screenshot: X11
GetImage(same as imageproc) - RGB conversion: BGRA → flat RGB array; also build luma for BFS
- Model download: Download models to
~/.cache/qhints/ocrs/- Skips if already cached
- OCR:
ocrs::OcrEngine::newwith detection + recognition modelsengine.prepare_input(ImageSource)→ocr_inputengine.detect_words(&ocr_input)→ word bounding boxes asChildKind::Text
- BFS gap-filling: Canny + dilation + BFS on luma →
ChildKind::Element - Filter: Remove BFS components overlapping OCR words >30%
- Merge: BFS first, then word boxes appended
Private functions
#![allow(unused)]
fn main() {
fn cache_dir() -> Result<PathBuf, Box<dyn std::error::Error>>
fn download_model(
url: &str,
path: &Path,
) -> Result<(), Box<dyn std::error::Error>>
}