Vision and roadmap

Where we're headed and how we'll get there

We’re building a data engineering platform focused on developer experience. Our goal is to empower data scientists to quickly turn ad-hoc data science into dependable data services.

There is a chasm between data science and data engineering. Data science is creative, focused on understanding data and finding insights and use cases. Data engineering is operational: integrating infrastructure and data sources, deploying pipelines, managing permissions, and tracking data quality and lineage.

Too much good data science stays ad-hoc because the engineering overhead is too great. To trigger real-time alerts, respond to user input, or integrate analytics into your app, you quickly wind up with Postgres, BigQuery, Fivetran, dbt, maybe Kafka, an API gateway, and home-baked Python data pipelines. Sound familiar?

We’re pursuing a unified approach that combines data storage, processing, and visualization with data quality management and governance in one platform. We’re inspired by services like GitLab and Netlify that make it remarkably easy for developers to build and run web apps. We want to give data scientists a platform to deploy, monitor, derive, visualize, integrate, and share analytics.

Getting there is a journey, and we’re starting with data storage in the form of streams you can replay, subscribe to, query with SQL, monitor, and share. Streams are the primitive that the rest of our roadmap builds upon.
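
To make "replay, subscribe to, query with SQL" a little more concrete, here is a minimal in-memory sketch in Python. It is not Beneath's API: `ToyStream`, `to_sql_table`, and the `clicks` table are hypothetical names invented for illustration, and SQLite only stands in for a real query engine.

```python
import sqlite3
from typing import Any, Callable

Record = dict[str, Any]


class ToyStream:
    """Hypothetical in-memory stream: an append-only log with subscribers."""

    def __init__(self) -> None:
        self._log: list[Record] = []
        self._subscribers: list[Callable[[Record], None]] = []

    def write(self, record: Record) -> None:
        # Append to the log, then push the record to every live subscriber.
        self._log.append(record)
        for callback in self._subscribers:
            callback(record)

    def replay(self, from_offset: int = 0) -> list[Record]:
        # Return all records from the given offset, oldest first.
        return list(self._log[from_offset:])

    def subscribe(self, callback: Callable[[Record], None]) -> None:
        # Only records written after this call reach the callback.
        self._subscribers.append(callback)


def to_sql_table(stream: ToyStream) -> sqlite3.Connection:
    # Copy the log into an in-memory SQLite table so it can be queried with SQL.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE clicks (username TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO clicks (username, amount) VALUES (:username, :amount)",
        stream.replay(),
    )
    return conn


stream = ToyStream()
stream.write({"username": "ada", "amount": 3})

seen: list[Record] = []
stream.subscribe(seen.append)  # sees only the two writes below
stream.write({"username": "bob", "amount": 5})
stream.write({"username": "ada", "amount": 4})

history = stream.replay()  # sees all three writes
total_for_ada = to_sql_table(stream).execute(
    "SELECT SUM(amount) FROM clicks WHERE username = 'ada'"
).fetchone()[0]
```

The point of the sketch is that one log serves three access patterns: a late subscriber gets new records as they arrive, a replay recovers the full history, and a SQL query aggregates over the same data.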

We hope you’re excited about our journey. We’d love to get your feedback, so if you’re up for it, reach out. If you just want to stay updated on our progress, follow us on Twitter or sign up for our newsletter.

Roadmap

Data storage
- Log streaming with at-least-once delivery
- Key-based table indexing for fast lookups
- Data warehouse replication for OLAP queries (SQL)
- Data versioning
- Secondary table indexes
- Schema evolution and migration
- Log features: partitioning, compaction, multi-consumer subscriptions
- Strongly consistent table operations for OLTP
- Full-text search engine
- Geo-replicated storage
Data processing
- (In progress) Scheduled/triggered SQL queries
- Compute sandbox for batch and streaming pipelines
- Git-based stream, query and pipeline deployments
- Data app catalog (one-click parameterized deployments)
- DAG view of streams and deriving pipelines for data lineage
Data visualization and exploration
- (In progress) Vega-based charts
- (In progress) Dashboards composed from charts and tables
- Alerting layer
- Python notebooks (Jupyter)
Data quality and governance
- Web console for creating and browsing resources
- Usage dashboards for streams, services, users and organizations
- Custom data usage quotas
- Granular permissions management, including public streams
- Service accounts with custom permissions and quotas
- API secrets (tokens) that can be issued/monitored/revoked
- Field validation rules, checked on write
- Data quality tests
- Data search and discovery
- Audit logs as meta-streams
Integrations
- gRPC, REST and WebSocket APIs
- Command-line interface (CLI)
- Python client
- JS and React clients
- PostgreSQL wire-protocol compatibility
- GraphQL API for data
- Row-restricted access tokens for identity-centered apps
- Self-hosted Beneath on Kubernetes with federation

© Beneath Systems
