Parse and analyze test results, build outputs, and CI/CD pipeline logs from 90+ formats with severity filtering and format auto-detection
Installing and Loading
INSTALL duck_hunt FROM community;
LOAD duck_hunt;
Example
-- Parse build errors (warnings and above)
SELECT ref_file, ref_line, message
FROM read_duck_hunt_log('build.log', 'auto', severity_threshold := 'warning')
WHERE status = 'ERROR';
-- Parse test results with all events
SELECT test_name, status, execution_time
FROM read_duck_hunt_log('pytest.json', 'pytest_json', severity_threshold := 'all')
WHERE status = 'FAIL';
-- Process multiple files, skip failures (new in v1.4.0)
SELECT * FROM read_duck_hunt_log('logs/*.json', ignore_errors := true);
-- Control log_content size for memory efficiency (new in v1.4.0)
SELECT * FROM read_duck_hunt_log('huge.log', content := 200);
SELECT * FROM read_duck_hunt_log('huge.log', content := 'smart');
-- Parse GitHub Actions ZIP archives (new in v1.4.0, requires zipfs)
-- INSTALL zipfs FROM community; LOAD zipfs;
SELECT job_name, severity, message
FROM read_duck_hunt_workflow_log('workflow_run.zip', 'github_actions_zip')
WHERE severity = 'error';
-- Debug format detection
SELECT format, can_parse, events_produced, is_selected
FROM duck_hunt_diagnose_read('build.log');
-- Build health badge
SELECT status_badge(
COUNT(*) FILTER (WHERE status = 'ERROR'),
COUNT(*) FILTER (WHERE status = 'WARNING')
) FROM read_duck_hunt_log('build.log', 'auto');
About duck_hunt
Duck Hunt is a comprehensive DuckDB extension for parsing and analyzing development tool outputs. It provides a unified SQL interface to query test results, build logs, linting output, and CI/CD pipeline data from 90+ tools and formats.
Documentation: https://duck-hunt.readthedocs.io/
Core Table Functions:
read_duck_hunt_log(source, format, severity_threshold, ignore_errors, content)- Parse tool outputs from filesparse_duck_hunt_log(text, format, severity_threshold, content)- Parse tool outputs from stringsread_duck_hunt_workflow_log(source, format, severity_threshold, ignore_errors)- Parse CI/CD workflow logsparse_duck_hunt_workflow_log(text, format, severity_threshold)- Parse workflow strings
New in v1.4.0:
ignore_errorsparameter - Continue processing when individual files failcontentparameter - Control log_content size: integer limit,'smart','none', or'full'github_actions_zipformat - Parse ZIP archives from GitHub Actions (requires zipfs extension)- Workflow delegation - Workflow parsers auto-delegate to tool parsers (make, pytest, etc.)
Diagnostic Functions:
duck_hunt_detect_format(content)- Auto-detect format, returns format name or NULLduck_hunt_diagnose_read(path)- Debug which parsers match a fileduck_hunt_diagnose_parse(content)- Debug which parsers match contentduck_hunt_formats()- List all supported formats with priorities
Scalar Functions:
status_badge(status)- Convert status to badge: [ OK ], [FAIL], [WARN], [ .. ], [ ?? ]status_badge(errors, warnings)- Compute badge from countsstatus_badge(errors, warnings, is_running)- Badge with running state
Severity Filtering:
Filter events by severity level: 'all' < 'info' < 'warning' < 'error' < 'critical'
Supported Formats (90+): See: https://duck-hunt.readthedocs.io/en/latest/formats/
- Dynamic:
regexp:<pattern>- Custom patterns with named capture groups - Test Frameworks: pytest, Go test, Cargo test, JUnit, RSpec, Mocha/Chai, Google Test, NUnit/xUnit
- Linting Tools: ESLint, RuboCop, Pylint, Flake8, MyPy, Clippy, SwiftLint, PHPStan, isort, and more
- Build Systems: CMake, Make, Maven, Gradle, Cargo, MSBuild, Node.js, Python, Spack
- CI/CD Engines: GitHub Actions, GitLab CI, Jenkins, Docker, Drone CI, Terraform
- Distributed Systems: HDFS, Spark, Zookeeper, OpenStack, Android, Blue Gene/L
- Infrastructure: Kubernetes, AWS CloudTrail, VPC Flow, S3 Access, Syslog
- Debugging: Valgrind, GDB/LLDB, strace
Schema Fields (39): See: https://duck-hunt.readthedocs.io/en/latest/schema/
- Reference context: ref_file, ref_line, ref_column (locations mentioned in logs)
- Log source: log_file, log_content, log_line_start, log_line_end
- Core: event_id, tool_name, event_type, status, severity, message
- Error Analysis: fingerprint, similarity_score, pattern_id
- Hierarchy: scope, group, unit, subunit (and corresponding _id, _status fields)
Key Features:
- Automatic format detection with diagnostic debugging
- Transparent compression support (GZIP built-in, ZSTD via parquet)
- GitHub Actions ZIP archive support (via zipfs extension)
- Severity-based event filtering
- Error pattern clustering and fingerprinting
- Multi-file glob processing with ignore_errors for robust batch processing
- Memory-efficient content modes (limit, smart truncation, omit)
- Workflow delegation to tool-specific parsers
- Pipeline integration with stdin support
- Hierarchical CI/CD workflow parsing
Perfect for CI/CD analysis, automated debugging, test aggregation, quality gates, and agent-driven development workflows.
Added Functions
| function_name | function_type | description | comment | examples |
|---|---|---|---|---|
| duck_hunt_detect_format | scalar | NULL | NULL | |
| duck_hunt_diagnose_parse | table | NULL | NULL | |
| duck_hunt_diagnose_read | table | NULL | NULL | |
| duck_hunt_formats | table | NULL | NULL | |
| duck_hunt_match_command_patterns | table_macro | NULL | NULL | |
| parse_duck_hunt_log | table | NULL | NULL | |
| parse_duck_hunt_workflow_log | table | NULL | NULL | |
| read_duck_hunt_log | table | NULL | NULL | |
| read_duck_hunt_workflow_log | table | NULL | NULL | |
| status_badge | scalar | NULL | NULL |