Configuration
Configuring your project is done with a simple YAML file, named quary.yaml and placed at the root of a project. The file defines which database to use and how to connect to it. The following are the different options for configuring the database.
BigQuery
BigQuery is a cloud based database, it is a good choice for large projects and for production. It is operated by Google and is part of the Google Cloud Platform. To use BigQuery you need to create a project in the Google Cloud Platform and enable BigQuery for the project. You can configure the connection to BigQuery with the following configuration.
bigQuery:
projectId: my-project-id
datasetId: my-dataset-idSnowflake
Snowflake is a cloud-based data warehousing platform that offers scalable, on-demand storage and computing power and that separates storage and compute, enabling users to independently scale and pay for each.
snowflake:
accountUrl: account-url
clientId: client-id
clientSecret: client-secret
role: your-quary-role
database: database-name
schema: schema-name
warehouse: warehouse-namePostgres
Postgres is a popular open-source database. It is a good choice for small to medium projects and for production.
postgres:
schema: transformAll connection details are passed as environment variables. The following environment variables are used to connect to the database. PGHOST, PGPORT (if not set defaults to 5432), PGUSER, PGPASSWORD, PGDATABASE. They are the same as those used by the psql command line tool.
Redshift
Redshift is a cloud-based data warehousing service provided by Amazon Web Services (AWS). It is designed for large-scale data storage and analysis.
redshift:
schema: your-schemaAll connection details are passed as environment variables. The following environment variables are used to connect to the database:
RSHOSTRSUSERRSPASSWORDRSDATABASERSPORT(if not set defaults to 5432)
DuckDb
DuckDb is an in-process SQL database, it's a feature-rich SQL dialect that it is a great way to query data locally on your machine.
File Based
The database is stored in a file on the disk and is persisted between runs. In this case the path to the file is specified, and each run creates the file if it does not exist.
duckdb:
path: database.dbIn Memory
The database is created in memory and is lost when the application is closed. It is faster than the file-based version and is good for rapid testing.
duckdbInMemory: {}SQLite
SQLite is a database type that runs locally, it is a file-based database. It is a good choice for small projects and for testing. It can be operated in two manners
In Memory
The database is created in memory and is lost when the application is closed. It is faster than the file-based version and is good for rapid testing.
sqliteInMemory: {}File Based
The other option is to use a file based database, where the database is stored in a file on the disk and is persisted between runs. In this case the path to the file is specified, and each run creates the file if it does not exist. It is useful for testing, for small projects and when you want to use the database in different applications.
sqlite:
path: db.sqliteEnvironment variables
In order to make your set up easier, you can use a .env files to set environment variables. By default, quary will pick up variables inside of .env. You can specify other .env files with the -e flag. For example, the following will read both .env files.
quary init -e='.env' -e='local.env'