Data Sources
Add Data Source to a Knowledge Base
Create Presigned URLs for Data Source File Upload
Delete a Data Source from a Knowledge Base
List Data Sources for a Knowledge Base
Models
class APIFileUploadDataSource: …
File to upload as data source for knowledge base.
original_file_name: Optional[str]
The original file name
size_in_bytes: Optional[str]
The size of the file in bytes
stored_object_key: Optional[str]
The object key the file was stored as
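Because size_in_bytes is typed as Optional[str] rather than an integer, callers usually need to convert it before doing arithmetic. The following is a minimal sketch, assuming the field holds a base-10 integer string when present; parse_size_in_bytes is a hypothetical helper, not part of the SDK.

```python
from typing import Optional


def parse_size_in_bytes(size_in_bytes: Optional[str]) -> Optional[int]:
    """Convert the string-typed size_in_bytes field to an int.

    Assumption: the API returns a decimal string such as "2048".
    Returns None when the field is missing or empty.
    """
    if not size_in_bytes:
        return None
    return int(size_in_bytes)
```

For example, parse_size_in_bytes("2048") yields the integer 2048, and a missing value stays None instead of raising.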
class APIKnowledgeBaseDataSource: …
Data Source configuration for Knowledge Bases
aws_data_source: Optional[AwsDataSource]
AWS S3 Data Source for Display
bucket_name: Optional[str]
S3 bucket name
region: Optional[str]
Region of bucket
bucket_name: Optional[str]
Name of storage bucket - Deprecated, moved to data_source_details
created_at: Optional[datetime]
Creation date / time
dropbox_data_source: Optional[DropboxDataSource]
Dropbox Data Source for Display
file_upload_data_source: Optional[APIFileUploadDataSource]
File to upload as data source for knowledge base.
original_file_name: Optional[str]
The original file name
size_in_bytes: Optional[str]
The size of the file in bytes
stored_object_key: Optional[str]
The object key the file was stored as
item_path: Optional[str]
Path of folder or object in bucket - Deprecated, moved to data_source_details
last_datasource_indexing_job: Optional[APIIndexedDataSource]
completed_at: Optional[datetime]
Timestamp when data source completed indexing
data_source_uuid: Optional[str]
UUID of the indexed data source
error_details: Optional[str]
A detailed error description
error_msg: Optional[str]
A string code providing a hint as to which part of the system experienced an error
failed_item_count: Optional[str]
Total count of files that have failed
indexed_file_count: Optional[str]
Total count of files that have been indexed
indexed_item_count: Optional[str]
Total count of files that have been indexed
removed_item_count: Optional[str]
Total count of files that have been removed
skipped_item_count: Optional[str]
Total count of files that have been skipped
started_at: Optional[datetime]
Timestamp when data source started indexing
status: Optional[Literal["DATA_SOURCE_STATUS_UNKNOWN", "DATA_SOURCE_STATUS_IN_PROGRESS", "DATA_SOURCE_STATUS_UPDATED", 3 more]]
total_bytes: Optional[str]
Total size of files in data source in bytes
total_bytes_indexed: Optional[str]
Total size of files in data source in bytes that have been indexed
total_file_count: Optional[str]
Total file count in the data source
last_indexing_job: Optional[APIIndexingJob]
IndexingJob description
completed_datasources: Optional[int]
Number of data sources that have completed indexing
created_at: Optional[datetime]
Creation date / time
knowledge_base_uuid: Optional[str]
Knowledge base id
phase: Optional[Literal["BATCH_JOB_PHASE_UNKNOWN", "BATCH_JOB_PHASE_PENDING", "BATCH_JOB_PHASE_RUNNING", 4 more]]
status: Optional[Literal["INDEX_JOB_STATUS_UNKNOWN", "INDEX_JOB_STATUS_PARTIAL", "INDEX_JOB_STATUS_IN_PROGRESS", 4 more]]
tokens: Optional[int]
Number of tokens
total_datasources: Optional[int]
Number of data sources being indexed
total_items_failed: Optional[str]
Total Items Failed
total_items_indexed: Optional[str]
Total Items Indexed
total_items_skipped: Optional[str]
Total Items Skipped
updated_at: Optional[datetime]
Last modified
uuid: Optional[str]
Unique id
region: Optional[str]
Region code - Deprecated, moved to data_source_details
spaces_data_source: Optional[APISpacesDataSource]
Spaces Bucket Data Source
bucket_name: Optional[str]
Spaces bucket name
region: Optional[str]
Region of bucket
updated_at: Optional[datetime]
Last modified
uuid: Optional[str]
Unique id of knowledge base
web_crawler_data_source: Optional[APIWebCrawlerDataSource]
WebCrawlerDataSource
base_url: Optional[str]
The base URL to crawl.
crawling_option: Optional[Literal["UNKNOWN", "SCOPED", "PATH", 2 more]]
Options for specifying how URLs found on pages should be handled.
- UNKNOWN: Default unknown value
- SCOPED: Only include the base URL.
- PATH: Crawl the base URL and linked pages within the URL path.
- DOMAIN: Crawl the base URL and linked pages within the same domain.
- SUBDOMAINS: Crawl the base URL and linked pages for any subdomain.
embed_media: Optional[bool]
Whether to ingest and index media (images, etc.) on web pages.
class APISpacesDataSource: …
Spaces Bucket Data Source
bucket_name: Optional[str]
Spaces bucket name
region: Optional[str]
Region of bucket
class APIWebCrawlerDataSource: …
WebCrawlerDataSource
base_url: Optional[str]
The base URL to crawl.
crawling_option: Optional[Literal["UNKNOWN", "SCOPED", "PATH", 2 more]]
Options for specifying how URLs found on pages should be handled.
- UNKNOWN: Default unknown value
- SCOPED: Only include the base URL.
- PATH: Crawl the base URL and linked pages within the URL path.
- DOMAIN: Crawl the base URL and linked pages within the same domain.
- SUBDOMAINS: Crawl the base URL and linked pages for any subdomain.
embed_media: Optional[bool]
Whether to ingest and index media (images, etc.) on web pages.
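To make the crawling_option values above concrete, the following sketch shows one possible reading of each option as a URL scope check. It is not the crawler's actual implementation. in_scope is a hypothetical function, trailing slashes are ignored, and SUBDOMAINS is assumed to mean the base host plus any host beneath it.

```python
from urllib.parse import urlsplit


def in_scope(base_url: str, candidate_url: str, crawling_option: str) -> bool:
    """Illustrative interpretation of crawling_option; not the real crawler logic."""
    base = urlsplit(base_url)
    cand = urlsplit(candidate_url)
    base_host = (base.hostname or "").lower()
    cand_host = (cand.hostname or "").lower()
    base_path = base.path.rstrip("/")
    cand_path = cand.path.rstrip("/")

    if crawling_option == "SCOPED":
        # Only the base URL itself.
        return cand_host == base_host and cand_path == base_path
    if crawling_option == "PATH":
        # Same host, and the path is the base path or nested under it.
        return cand_host == base_host and (
            cand_path == base_path or cand_path.startswith(base_path + "/")
        )
    if crawling_option == "DOMAIN":
        # Any page on the same host.
        return cand_host == base_host
    if crawling_option == "SUBDOMAINS":
        # Assumption: the base host or any subdomain of it.
        return cand_host == base_host or cand_host.endswith("." + base_host)
    # UNKNOWN or unrecognized values match nothing in this sketch.
    return False
```

With base_url set to https://example.com/docs, the page https://example.com/docs/api is in scope for PATH but not for SCOPED, while https://blog.example.com/ is in scope only for SUBDOMAINS.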
class AwsDataSource: …
AWS S3 Data Source
bucket_name: Optional[str]
S3 bucket name
key_id: Optional[str]
The AWS Key ID
region: Optional[str]
Region of bucket
secret_key: Optional[str]
The AWS Secret Key
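Since AwsDataSource carries key_id and secret_key, it can help to keep those values out of source code. The sketch below builds a plain dict with the four documented field names, reading the credentials from the standard AWS environment variables. aws_data_source_from_env is a hypothetical helper, and whether the SDK accepts a dict in this shape is an assumption.

```python
import os
from typing import Dict


def aws_data_source_from_env(bucket_name: str, region: str) -> Dict[str, str]:
    """Build an AwsDataSource-shaped dict without hard-coding credentials.

    Assumption: credentials live in AWS_ACCESS_KEY_ID and
    AWS_SECRET_ACCESS_KEY; a KeyError is raised if either is unset.
    """
    return {
        "bucket_name": bucket_name,
        "region": region,
        "key_id": os.environ["AWS_ACCESS_KEY_ID"],
        "secret_key": os.environ["AWS_SECRET_ACCESS_KEY"],
    }
```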