Data Collections
Data Collections are the foundation of Octigen's data integration system. They organize your data files and define how those files are identified and accessed.
What is a Data Collection?
A Data Collection is:
- A container for related data files
- A schema defining how files are named and organized
- A connection point between your data and slide templates
Think of it as a smart folder that knows what kind of files belong inside and how to find the right one when asked.
Creating a Data Collection
Navigate to Data Collections
- Go to your space's main page
- Click the "Data" tab
- Click "Add Data Collection"
Basic Information
Collection Name
- Human-readable name for the collection
- Example: "Monthly Portfolio Holdings"
Source Format
- CSV - Delimited text files. Comma-separated, semicolon-separated, and tab-separated files are supported. CSV files must be UTF-8 encoded.
- Excel (.xlsx) - Microsoft Excel files containing a single sheet of raw data (calculated columns are not supported at this point).
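Before uploading, it can help to check locally that a CSV file is UTF-8 encoded and to see which delimiter it uses. The following is a minimal sketch, not part of Octigen: `inspect_csv` is a hypothetical helper, and the delimiter guess is a simple heuristic (the candidate that occurs most often in the header row), not the detection Octigen itself performs.

```python
import csv
import io

# The three delimiters named in the Source Format description.
CANDIDATE_DELIMITERS = (",", ";", "\t")


def inspect_csv(data: bytes) -> tuple[str, list[str]]:
    """Return (delimiter, header columns) for raw CSV bytes.

    Raises ValueError if the bytes are not UTF-8 or contain no header row.
    """
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError("file is not valid UTF-8") from exc

    # Use the first non-blank line as the header row.
    first_line = next((ln for ln in text.splitlines() if ln.strip()), None)
    if first_line is None:
        raise ValueError("file has no header row")

    # Heuristic (assumption): pick the candidate seen most often in the header.
    delimiter = max(CANDIDATE_DELIMITERS, key=first_line.count)
    header = next(csv.reader(io.StringIO(first_line), delimiter=delimiter))
    return delimiter, header
```

For example, `inspect_csv(b"date;amount\n2024-12-01;10\n")` reports a semicolon delimiter and the columns `date` and `amount`.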
Source Connection
Currently only one type of source connection is supported: Manual upload.
Data Type
Currently only one data type is supported: Simple 2D table with header row.
Defining Parameters
Parameters identify individual files within the collection. For each parameter, you define:
Name
- The parameter name (lowercase, use underscores)
- Examples: valuation_date, portfolio_id, version
Type
- date - Date values (represented in YYYY-MM-DD format within Octigen)
- string - Text values (portfolio IDs, names)
- integer - Numeric values (versions, sequence numbers)
Mapping
- prefix - Parameter value appears at the start of the filename
- suffix - Parameter value appears at the end of the filename
- Coming soon: mapping by sheet name, mapping by column value.
Note: Because only the two mappings "prefix" and "suffix" are currently available, the number of useful parameters is effectively limited to two. This will change in later versions of Octigen, which will introduce more mapping types.
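The parameter definition described above can be pictured as a small data structure. This is an illustrative sketch only: the `Parameter` class and `check_schema` function are hypothetical names, not Octigen's API, and the rule that each mapping is used at most once is an assumption derived from the note about the two-parameter limit.

```python
import re
from dataclasses import dataclass

PARAM_TYPES = {"date", "string", "integer"}
MAPPINGS = {"prefix", "suffix"}

# Assumption: "lowercase, use underscores" means a-z, digits and underscores.
_NAME_RE = re.compile(r"[a-z][a-z0-9_]*")


@dataclass(frozen=True)
class Parameter:
    name: str
    type: str
    mapping: str

    def __post_init__(self):
        # Reject definitions that do not follow the rules listed above.
        if not _NAME_RE.fullmatch(self.name):
            raise ValueError(f"invalid parameter name: {self.name!r}")
        if self.type not in PARAM_TYPES:
            raise ValueError(f"unknown type: {self.type!r}")
        if self.mapping not in MAPPINGS:
            raise ValueError(f"unknown mapping: {self.mapping!r}")


def check_schema(params: list[Parameter]) -> None:
    """Raise ValueError if two parameters share the same mapping."""
    mappings = [p.mapping for p in params]
    if len(mappings) != len(set(mappings)):
        raise ValueError("each mapping (prefix, suffix) can be used only once")
```

A schema with `portfolio_id` as prefix and `valuation_date` as suffix passes `check_schema`; adding a second suffix parameter such as `version` raises an error.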
Example: Monthly Sales Data
Settings:
- Collection Name: "Monthly Sales Data"
- Source Format: CSV
- Parameters:
  - valuation_date (type: date, mapping: suffix, required: yes)
Expected filenames:
sales_2024-12-01.csv
sales_2024-11-01.csv
sales_2024-10-01.csv
Example: Multi-Portfolio Holdings
Settings:
- Display Name: "Portfolio Holdings"
- Object Type: portfolio-holdings
- File Type: Excel
- Parameters:
  - portfolio_id (type: string, mapping: prefix, required: yes)
  - valuation_date (type: date, mapping: suffix, required: yes)
Expected filenames:
GROWTH_FUND_2024-12-01.xlsx
INCOME_FUND_2024-12-01.xlsx
GROWTH_FUND_2024-11-01.xlsx
Uploading Data Files
Drag and Drop Upload
- Open the data collection
- Drag files into the upload area, or click to browse
- Files are validated against the collection's parameter schema
Upload Validation
When you upload a file, Octigen validates:
- Filename matches the expected pattern
- Parameter values can be extracted
- Parameter types are correct (valid dates, integers)
Validation examples:
| Collection Parameters | Filename | Result |
|---|---|---|
| valuation_date (suffix, date) | sales_2024-12-01.csv | ✅ Valid |
| valuation_date (suffix, date) | sales_20241201.csv | ✅ Valid (alternative format) |
| valuation_date (suffix, date) | sales_december.csv | ❌ Invalid date |
| portfolio_id (prefix, string), valuation_date (suffix, date) | FUND_A_2024-12-01.csv | ✅ Valid |
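The three validation steps can be sketched for the single-suffix case as follows. This is a simplified illustration under stated assumptions, not Octigen's implementation: `validate_suffix` and `parse_value` are hypothetical names, the value is taken to be everything after the last underscore, and only the ISO, compact, and dot-separated date forms are handled here (an underscore-separated date would need extra handling because it collides with the separator).

```python
import re
from datetime import date
from pathlib import PurePath

# Year, optional separator, month, same separator, day.
_DATE_RE = re.compile(r"(\d{4})([-.]?)(\d{2})\2(\d{2})")


def parse_value(param_type: str, raw: str):
    """Convert an extracted filename value to its declared type."""
    if param_type == "integer":
        if not (raw.isascii() and raw.isdigit()):
            raise ValueError(f"invalid integer: {raw!r}")
        return int(raw)
    if param_type == "date":
        m = _DATE_RE.fullmatch(raw)
        if m is None:
            raise ValueError(f"invalid date: {raw!r}")
        # date() itself rejects impossible dates such as month 13.
        return date(int(m[1]), int(m[3]), int(m[4]))
    if param_type == "string":
        if not raw:
            raise ValueError("empty string value")
        return raw
    raise ValueError(f"unknown type: {param_type!r}")


def validate_suffix(filename: str, param_type: str):
    """Extract and type-check a suffix parameter from a filename."""
    stem = PurePath(filename).stem            # drop the extension
    base, sep, raw = stem.rpartition("_")     # split at the last underscore
    if not sep or not base:
        raise ValueError("filename doesn't match parameter pattern")
    return parse_value(param_type, raw)
```

With this sketch, `sales_2024-12-01.csv` and `sales_20241201.csv` both yield the same date, while `sales_december.csv` raises an invalid-date error, mirroring the first three rows of the table.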
Duplicate Detection
If you upload a file with the same parameter values as an existing file:
- System detects the duplicate
- You can choose to overwrite the existing file
- Or cancel and keep the original
This prevents accidental data loss while allowing intentional updates.
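Duplicate detection by parameter values, rather than by filename, can be sketched like this. The `FileIndex` class is hypothetical and only illustrates the idea; it assumes parameter values have already been normalized (for example, dates converted to YYYY-MM-DD) before they are compared.

```python
from typing import Optional


class FileIndex:
    """Tracks one file per unique combination of parameter values."""

    def __init__(self) -> None:
        self._files: dict[tuple, str] = {}

    @staticmethod
    def _key(params: dict[str, str]) -> tuple:
        # Sort so that the key does not depend on parameter order.
        return tuple(sorted(params.items()))

    def find(self, params: dict[str, str]) -> Optional[str]:
        """Return the stored filename for these parameter values, if any."""
        return self._files.get(self._key(params))

    def upload(self, params: dict[str, str], filename: str,
               overwrite: bool = False) -> None:
        """Store a file; refuse duplicates unless overwrite is requested."""
        key = self._key(params)
        if key in self._files and not overwrite:
            raise FileExistsError(
                f"duplicate of {self._files[key]!r}; pass overwrite=True to replace it"
            )
        self._files[key] = filename
```

Because the key is built from parameter values, `sales_20241201.csv` counts as a duplicate of `sales_2024-12-01.csv` in this sketch even though the filenames differ.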
Viewing Collection Contents
File List
The collection detail page shows all uploaded files with:
- Filename
- Parameter values extracted from the filename
- File size
- Upload date / last modified
- Actions (download, delete)
Managing Data Files
Downloading
Click the download icon next to any file to download the original.
Deleting
Click the delete icon to remove a file. This action:
- Cannot be undone
- Affects future presentations only
- Does not affect already-committed presentations
Updating
To update data for the same parameters:
- Delete the previous file
- Upload the new file with the same filename
- If Octigen still reports a duplicate, confirm that you want to overwrite
Parameter Mapping Details
Prefix Mapping
The parameter value appears at the start of the filename, followed by an underscore.
{parameter_value}_{rest_of_filename}.{extension}
Example: portfolio_id with prefix mapping
- Parameter value: GROWTH_FUND
- Filename: GROWTH_FUND_2024-12-01.csv
Suffix Mapping
The parameter value appears at the end of the filename (before extension), preceded by an underscore.
{base_name}_{parameter_value}.{extension}
Example: valuation_date with suffix mapping
- Parameter value: 2024-12-01
- Filename: sales_data_2024-12-01.csv
Combining Mappings
Multiple parameters combine in the filename:
{prefix_param}_{base}_{suffix_param}.{extension}
Example: portfolio_id (prefix) + valuation_date (suffix)
- Parameters: portfolio_id=INCOME, valuation_date=2024-12-01
- Filename: INCOME_holdings_2024-12-01.xlsx
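When files are exported by a script, the filename patterns in this section can be produced by a small helper. The function below is a hypothetical convenience sketch, not an Octigen API; it simply joins the non-empty parts with underscores in prefix, base, suffix order.

```python
from typing import Optional


def build_filename(base: str, extension: str,
                   prefix: Optional[str] = None,
                   suffix: Optional[str] = None) -> str:
    """Assemble {prefix}_{base}_{suffix}.{extension}, skipping empty parts."""
    parts = [part for part in (prefix, base, suffix) if part]
    return "_".join(parts) + "." + extension


# Reproduces the combined example from this section:
name = build_filename("holdings", "xlsx", prefix="INCOME", suffix="2024-12-01")
# name == "INCOME_holdings_2024-12-01.xlsx"
```

Passing an empty `base` gives filenames that consist only of the parameter values, such as `GROWTH_FUND_2024-12-01.csv`.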
Date Format Flexibility
For date parameters, Octigen recognizes multiple formats in filenames:
| Format | Example | Accepted |
|---|---|---|
| ISO | 2024-12-01 | ✅ |
| Compact | 20241201 | ✅ |
| Dot separator | 2024.12.01 | ✅ |
| Underscore separator | 2024_12_01 | ✅ |
All formats are normalized internally, so:
- Upload: sales_20241201.csv
- Query: valuation_date=2024-12-01
- Result: ✅ Match found
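The normalization step can be illustrated with a short function that maps each of the four accepted formats to YYYY-MM-DD. This is a sketch of the idea rather than Octigen's actual code; `normalize_date` is a hypothetical name, and requiring the same separator in both positions is an assumption.

```python
import re
from datetime import date

# Year, optional separator (-, . or _), month, the same separator again, day.
_DATE_RE = re.compile(r"(\d{4})([-._]?)(\d{2})\2(\d{2})")


def normalize_date(raw: str) -> str:
    """Return the date in YYYY-MM-DD form, or raise ValueError."""
    m = _DATE_RE.fullmatch(raw)
    if m is None:
        raise ValueError(f"unrecognized date format: {raw!r}")
    year, _sep, month, day = m.groups()
    # date() raises ValueError for impossible dates such as 2024-02-30.
    return date(int(year), int(month), int(day)).isoformat()


# The upload and the query from the example normalize to the same value:
assert normalize_date("20241201") == normalize_date("2024-12-01")
```

Comparing normalized values is what lets a file uploaded with a compact date be found by a query that uses the ISO form.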
Best Practices
Naming Conventions
Collection IDs:
- Be descriptive but concise
- Avoid special characters
Parameter Names:
- Use lowercase with underscores: valuation_date, portfolio_id
- Match industry conventions where applicable
- Be consistent across collections
Filenames:
- Follow the parameter mapping pattern exactly
- Use consistent date formats
- Avoid spaces and special characters
Organization
One data type per collection:
- ✅ "Sales Data" collection with sales files
- ❌ "All Data" collection mixing sales, inventory, and HR files
Logical groupings:
- By report type (monthly sales, quarterly reviews)
- By data domain (portfolio, market, client)
- By update frequency (daily, weekly, monthly)
Data Quality
Before uploading:
- Verify column headers are correct
- Check data completeness and quality
Consistent structure:
- Same columns across files in a collection
- Consistent data types per column
- No extra rows or formatting
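A quick local check for the "same columns across files" rule might look like the sketch below. It is an illustration only: both function names are hypothetical, the files are passed in as already-decoded text, and the first file is treated as the reference.

```python
import csv
import io


def header_of(csv_text: str, delimiter: str = ",") -> list[str]:
    """Return the header row of a CSV document (empty list if there is none)."""
    return next(csv.reader(io.StringIO(csv_text), delimiter=delimiter), [])


def files_with_different_headers(files: dict[str, str],
                                 delimiter: str = ",") -> list[str]:
    """Return the names of files whose header differs from the first file's."""
    names = list(files)
    if not names:
        return []
    reference = header_of(files[names[0]], delimiter)
    return [
        name for name in names[1:]
        if header_of(files[name], delimiter) != reference
    ]
```

Running this over the files intended for one collection before uploading flags any file whose columns were renamed, reordered, or dropped.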
Common Issues
"Filename doesn't match parameter pattern"
Cause: The filename doesn't match the expected pattern for the collection's parameters.
Solution:
- Check parameter mappings (prefix vs suffix)
- Verify date format if using date parameters
- Ensure all required parameters are in the filename
"Invalid parameter value"
Cause: A parameter value couldn't be parsed correctly.
Solution:
- For dates: use a supported format such as YYYY-MM-DD or YYYYMMDD
- For integers: use only digits
- For strings: avoid special characters
"Duplicate file detected"
Cause: A file with the same parameters already exists.
Solution:
- Choose to overwrite the existing file, or first delete the previous version of the data file with the same parameters and upload again.
Next Steps
- Connect Data to Workflows - Create slide deck workflows that always update certain tables and charts in a deterministic way
- Configure Data for Slides - Connect collections to templates
- Use Data in Presentations - Generate data-driven slides