# Data Sources

## Create

`client.knowledgeBases.dataSources.create(knowledgeBaseUuid: string, body?: DataSourceCreateParams, options?: RequestOptions): DataSourceCreateResponse`

**post** `/v2/gen-ai/knowledge_bases/{knowledge_base_uuid}/data_sources`

To add a data source to a knowledge base, send a POST request to `/v2/gen-ai/knowledge_bases/{knowledge_base_uuid}/data_sources`.

### Parameters

- `knowledgeBaseUuid: string`
- `body: DataSourceCreateParams`
  - `aws_data_source?: AwsDataSource` AWS S3 Data Source
    - `bucket_name?: string` Spaces bucket name
    - `item_path?: string`
    - `key_id?: string` The AWS Key ID
    - `region?: string` Region of bucket
    - `secret_key?: string` The AWS Secret Key
  - `knowledge_base_uuid?: string` Knowledge base id
  - `spaces_data_source?: APISpacesDataSource` Spaces Bucket Data Source
    - `bucket_name?: string` Spaces bucket name
    - `item_path?: string`
    - `region?: string` Region of bucket
  - `web_crawler_data_source?: APIWebCrawlerDataSource` WebCrawlerDataSource
    - `base_url?: string` The base URL to crawl.
    - `crawling_option?: "UNKNOWN" | "SCOPED" | "PATH" | "DOMAIN" | "SUBDOMAINS"` Options for specifying how URLs found on pages should be handled.
      - `UNKNOWN`: Default unknown value
      - `SCOPED`: Only include the base URL.
      - `PATH`: Crawl the base URL and linked pages within the URL path.
      - `DOMAIN`: Crawl the base URL and linked pages within the same domain.
      - `SUBDOMAINS`: Crawl the base URL and linked pages for any subdomain.
    - `embed_media?: boolean` Whether to ingest and index media (images, etc.) on web pages.
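For instance, a `body` registering a web crawler source might be built like the sketch below. The URL and option values are illustrative, not defaults, and the commented-out `create` call requires a configured client with credentials:

```typescript
// Illustrative DataSourceCreateParams body for a web crawler source.
// The base_url below is a placeholder, not a real endpoint.
const body = {
  web_crawler_data_source: {
    base_url: 'https://docs.example.com/guides',
    // PATH: crawl the base URL plus linked pages under the same URL path.
    crawling_option: 'PATH' as const,
    // Skip images and other media to keep indexing lean.
    embed_media: false,
  },
};

// With a live client (credentials required):
// const dataSource = await client.knowledgeBases.dataSources.create(
//   '123e4567-e89b-12d3-a456-426614174000',
//   body,
// );

console.log(body.web_crawler_data_source.crawling_option);
```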
### Returns

- `DataSourceCreateResponse` Information about a newly created knowledge base data source
  - `knowledge_base_data_source?: APIKnowledgeBaseDataSource` Data Source configuration for Knowledge Bases
    - `aws_data_source?: AwsDataSource` AWS S3 Data Source for Display
      - `bucket_name?: string` Spaces bucket name
      - `item_path?: string`
      - `region?: string` Region of bucket
    - `bucket_name?: string` Name of storage bucket - Deprecated, moved to data_source_details
    - `created_at?: string` Creation date / time
    - `dropbox_data_source?: DropboxDataSource` Dropbox Data Source for Display
      - `folder?: string`
    - `file_upload_data_source?: APIFileUploadDataSource` File to upload as data source for knowledge base.
      - `original_file_name?: string` The original file name
      - `size_in_bytes?: string` The size of the file in bytes
      - `stored_object_key?: string` The object key the file was stored as
    - `item_path?: string` Path of folder or object in bucket - Deprecated, moved to data_source_details
    - `last_datasource_indexing_job?: APIIndexedDataSource`
      - `completed_at?: string` Timestamp when data source completed indexing
      - `data_source_uuid?: string` Uuid of the indexed data source
      - `error_details?: string` A detailed error description
      - `error_msg?: string` A string code providing a hint about which part of the system experienced an error
      - `failed_item_count?: string` Total count of items that have failed
      - `indexed_file_count?: string` Total count of files that have been indexed
      - `indexed_item_count?: string` Total count of items that have been indexed
      - `removed_item_count?: string` Total count of items that have been removed
      - `skipped_item_count?: string` Total count of items that have been skipped
      - `started_at?: string` Timestamp when data source started indexing
      - `status?: "DATA_SOURCE_STATUS_UNKNOWN" | "DATA_SOURCE_STATUS_IN_PROGRESS" | "DATA_SOURCE_STATUS_UPDATED" | "DATA_SOURCE_STATUS_PARTIALLY_UPDATED" | "DATA_SOURCE_STATUS_NOT_UPDATED" | "DATA_SOURCE_STATUS_FAILED"`
      - `total_bytes?: string` Total size of files in data source in bytes
      - `total_bytes_indexed?: string` Total size of files in data source in bytes that have been indexed
      - `total_file_count?: string` Total file count in the data source
    - `last_indexing_job?: APIIndexingJob` IndexingJob description
      - `completed_datasources?: number` Number of data sources that have completed indexing
      - `created_at?: string` Creation date / time
      - `data_source_uuids?: Array`
      - `finished_at?: string`
      - `knowledge_base_uuid?: string` Knowledge base id
      - `phase?: "BATCH_JOB_PHASE_UNKNOWN" | "BATCH_JOB_PHASE_PENDING" | "BATCH_JOB_PHASE_RUNNING" | "BATCH_JOB_PHASE_SUCCEEDED" | "BATCH_JOB_PHASE_FAILED" | "BATCH_JOB_PHASE_ERROR" | "BATCH_JOB_PHASE_CANCELLED"`
      - `started_at?: string`
      - `status?: "INDEX_JOB_STATUS_UNKNOWN" | "INDEX_JOB_STATUS_PARTIAL" | "INDEX_JOB_STATUS_IN_PROGRESS" | "INDEX_JOB_STATUS_COMPLETED" | "INDEX_JOB_STATUS_FAILED" | "INDEX_JOB_STATUS_NO_CHANGES" | "INDEX_JOB_STATUS_PENDING"`
      - `tokens?: number` Number of tokens
      - `total_datasources?: number` Number of data sources being indexed
      - `total_items_failed?: string` Total items failed
      - `total_items_indexed?: string` Total items indexed
      - `total_items_skipped?: string` Total items skipped
      - `updated_at?: string` Last modified
      - `uuid?: string` Unique id
    - `region?: string` Region code - Deprecated, moved to data_source_details
    - `spaces_data_source?: APISpacesDataSource` Spaces Bucket Data Source
      - `bucket_name?: string` Spaces bucket name
      - `item_path?: string`
      - `region?: string` Region of bucket
    - `updated_at?: string` Last modified
    - `uuid?: string` Unique id of knowledge base
    - `web_crawler_data_source?: APIWebCrawlerDataSource` WebCrawlerDataSource
      - `base_url?: string` The base URL to crawl.
      - `crawling_option?: "UNKNOWN" | "SCOPED" | "PATH" | "DOMAIN" | "SUBDOMAINS"` Options for specifying how URLs found on pages should be handled.
        - `UNKNOWN`: Default unknown value
        - `SCOPED`: Only include the base URL.
        - `PATH`: Crawl the base URL and linked pages within the URL path.
        - `DOMAIN`: Crawl the base URL and linked pages within the same domain.
        - `SUBDOMAINS`: Crawl the base URL and linked pages for any subdomain.
      - `embed_media?: boolean` Whether to ingest and index media (images, etc.) on web pages.

### Example

```typescript
import Gradient from '@digitalocean/gradient';

const client = new Gradient();

const dataSource = await client.knowledgeBases.dataSources.create('123e4567-e89b-12d3-a456-426614174000');

console.log(dataSource.knowledge_base_data_source);
```

## List

`client.knowledgeBases.dataSources.list(knowledgeBaseUuid: string, query?: DataSourceListParams, options?: RequestOptions): DataSourceListResponse`

**get** `/v2/gen-ai/knowledge_bases/{knowledge_base_uuid}/data_sources`

To list all data sources for a knowledge base, send a GET request to `/v2/gen-ai/knowledge_bases/{knowledge_base_uuid}/data_sources`.

### Parameters

- `knowledgeBaseUuid: string`
- `query: DataSourceListParams`
  - `page?: number` Page number.
  - `per_page?: number` Items per page.
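Because results are paginated via `page`/`per_page` and the response's `meta` block (current page, total pages, total items), a caller can loop until the current page reaches the page count. A minimal sketch; the `hasNextPage` helper is ours, not part of the SDK:

```typescript
// Minimal view of the response's meta block.
interface APIMeta {
  page?: number;   // the current page
  pages?: number;  // total number of pages
  total?: number;  // total items across all pages
}

// Returns true while there are further pages to fetch.
function hasNextPage(meta: APIMeta | undefined): boolean {
  if (!meta || meta.page === undefined || meta.pages === undefined) return false;
  return meta.page < meta.pages;
}

// With a live client (credentials required), usage might look like:
// let page = 1;
// let res;
// do {
//   res = await client.knowledgeBases.dataSources.list(kbUuid, { page, per_page: 25 });
//   // ...collect res.knowledge_base_data_sources...
//   page += 1;
// } while (hasNextPage(res.meta));

console.log(hasNextPage({ page: 1, pages: 3, total: 55 }));
```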
### Returns

- `DataSourceListResponse` A list of knowledge base data sources
  - `knowledge_base_data_sources?: Array` The data sources
    - `aws_data_source?: AwsDataSource` AWS S3 Data Source for Display
      - `bucket_name?: string` Spaces bucket name
      - `item_path?: string`
      - `region?: string` Region of bucket
    - `bucket_name?: string` Name of storage bucket - Deprecated, moved to data_source_details
    - `created_at?: string` Creation date / time
    - `dropbox_data_source?: DropboxDataSource` Dropbox Data Source for Display
      - `folder?: string`
    - `file_upload_data_source?: APIFileUploadDataSource` File to upload as data source for knowledge base.
      - `original_file_name?: string` The original file name
      - `size_in_bytes?: string` The size of the file in bytes
      - `stored_object_key?: string` The object key the file was stored as
    - `item_path?: string` Path of folder or object in bucket - Deprecated, moved to data_source_details
    - `last_datasource_indexing_job?: APIIndexedDataSource`
      - `completed_at?: string` Timestamp when data source completed indexing
      - `data_source_uuid?: string` Uuid of the indexed data source
      - `error_details?: string` A detailed error description
      - `error_msg?: string` A string code providing a hint about which part of the system experienced an error
      - `failed_item_count?: string` Total count of items that have failed
      - `indexed_file_count?: string` Total count of files that have been indexed
      - `indexed_item_count?: string` Total count of items that have been indexed
      - `removed_item_count?: string` Total count of items that have been removed
      - `skipped_item_count?: string` Total count of items that have been skipped
      - `started_at?: string` Timestamp when data source started indexing
      - `status?: "DATA_SOURCE_STATUS_UNKNOWN" | "DATA_SOURCE_STATUS_IN_PROGRESS" | "DATA_SOURCE_STATUS_UPDATED" | "DATA_SOURCE_STATUS_PARTIALLY_UPDATED" | "DATA_SOURCE_STATUS_NOT_UPDATED" | "DATA_SOURCE_STATUS_FAILED"`
      - `total_bytes?: string` Total size of files in data source in bytes
      - `total_bytes_indexed?: string` Total size of files in data source in bytes that have been indexed
      - `total_file_count?: string` Total file count in the data source
    - `last_indexing_job?: APIIndexingJob` IndexingJob description
      - `completed_datasources?: number` Number of data sources that have completed indexing
      - `created_at?: string` Creation date / time
      - `data_source_uuids?: Array`
      - `finished_at?: string`
      - `knowledge_base_uuid?: string` Knowledge base id
      - `phase?: "BATCH_JOB_PHASE_UNKNOWN" | "BATCH_JOB_PHASE_PENDING" | "BATCH_JOB_PHASE_RUNNING" | "BATCH_JOB_PHASE_SUCCEEDED" | "BATCH_JOB_PHASE_FAILED" | "BATCH_JOB_PHASE_ERROR" | "BATCH_JOB_PHASE_CANCELLED"`
      - `started_at?: string`
      - `status?: "INDEX_JOB_STATUS_UNKNOWN" | "INDEX_JOB_STATUS_PARTIAL" | "INDEX_JOB_STATUS_IN_PROGRESS" | "INDEX_JOB_STATUS_COMPLETED" | "INDEX_JOB_STATUS_FAILED" | "INDEX_JOB_STATUS_NO_CHANGES" | "INDEX_JOB_STATUS_PENDING"`
      - `tokens?: number` Number of tokens
      - `total_datasources?: number` Number of data sources being indexed
      - `total_items_failed?: string` Total items failed
      - `total_items_indexed?: string` Total items indexed
      - `total_items_skipped?: string` Total items skipped
      - `updated_at?: string` Last modified
      - `uuid?: string` Unique id
    - `region?: string` Region code - Deprecated, moved to data_source_details
    - `spaces_data_source?: APISpacesDataSource` Spaces Bucket Data Source
      - `bucket_name?: string` Spaces bucket name
      - `item_path?: string`
      - `region?: string` Region of bucket
    - `updated_at?: string` Last modified
    - `uuid?: string` Unique id of knowledge base
    - `web_crawler_data_source?: APIWebCrawlerDataSource` WebCrawlerDataSource
      - `base_url?: string` The base URL to crawl.
      - `crawling_option?: "UNKNOWN" | "SCOPED" | "PATH" | "DOMAIN" | "SUBDOMAINS"` Options for specifying how URLs found on pages should be handled.
        - `UNKNOWN`: Default unknown value
        - `SCOPED`: Only include the base URL.
        - `PATH`: Crawl the base URL and linked pages within the URL path.
        - `DOMAIN`: Crawl the base URL and linked pages within the same domain.
        - `SUBDOMAINS`: Crawl the base URL and linked pages for any subdomain.
      - `embed_media?: boolean` Whether to ingest and index media (images, etc.) on web pages.
  - `links?: APILinks` Links to other pages
    - `pages?: Pages` Information about how to reach other pages
      - `first?: string` First page
      - `last?: string` Last page
      - `next?: string` Next page
      - `previous?: string` Previous page
  - `meta?: APIMeta` Meta information about the data set
    - `page?: number` The current page
    - `pages?: number` Total number of pages
    - `total?: number` Total amount of items over all pages

### Example

```typescript
import Gradient from '@digitalocean/gradient';

const client = new Gradient();

const dataSources = await client.knowledgeBases.dataSources.list('123e4567-e89b-12d3-a456-426614174000');

console.log(dataSources.knowledge_base_data_sources);
```

## Delete

`client.knowledgeBases.dataSources.delete(dataSourceUuid: string, params: DataSourceDeleteParams, options?: RequestOptions): DataSourceDeleteResponse`

**delete** `/v2/gen-ai/knowledge_bases/{knowledge_base_uuid}/data_sources/{data_source_uuid}`

To delete a data source from a knowledge base, send a DELETE request to `/v2/gen-ai/knowledge_bases/{knowledge_base_uuid}/data_sources/{data_source_uuid}`.
### Parameters

- `dataSourceUuid: string`
- `params: DataSourceDeleteParams`
  - `knowledge_base_uuid: string` Knowledge base id

### Returns

- `DataSourceDeleteResponse` Information about a newly deleted knowledge base data source
  - `data_source_uuid?: string` Data source id
  - `knowledge_base_uuid?: string` Knowledge base id

### Example

```typescript
import Gradient from '@digitalocean/gradient';

const client = new Gradient();

const dataSource = await client.knowledgeBases.dataSources.delete('123e4567-e89b-12d3-a456-426614174000', {
  knowledge_base_uuid: '123e4567-e89b-12d3-a456-426614174000',
});

console.log(dataSource.data_source_uuid);
```

## Create Presigned URLs

`client.knowledgeBases.dataSources.createPresignedURLs(body?: DataSourceCreatePresignedURLsParams, options?: RequestOptions): DataSourceCreatePresignedURLsResponse`

**post** `/v2/gen-ai/knowledge_bases/data_sources/file_upload_presigned_urls`

To create presigned URLs for knowledge base data source file uploads, send a POST request to `/v2/gen-ai/knowledge_bases/data_sources/file_upload_presigned_urls`.

### Parameters

- `body: DataSourceCreatePresignedURLsParams`
  - `files?: Array` A list of files to generate presigned URLs for.
    - `file_name?: string` Local filename
    - `file_size?: string` The size of the file in bytes.

### Returns

- `DataSourceCreatePresignedURLsResponse` Response with presigned URLs to upload files.
  - `request_id?: string` The ID generated for the request for presigned URLs.
  - `uploads?: Array` A list of generated presigned URLs and object keys, one per file.
    - `expires_at?: string` The time the URL expires at.
    - `object_key?: string` The unique object key to store the file as.
    - `original_file_name?: string` The original file name.
    - `presigned_url?: string` The actual presigned URL the client can use to upload the file directly.
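Each `files` entry pairs a local filename with its size in bytes, given as a string. A sketch of building that payload and consuming the returned uploads; the file names and sizes are made up, and the upload step (a plain HTTP PUT to each `presigned_url`) is shown commented out because it needs a live response:

```typescript
// Illustrative files payload; names and sizes are placeholders.
const params = {
  files: [
    { file_name: 'handbook.pdf', file_size: '482133' },
    { file_name: 'faq.md', file_size: '10240' },
  ],
};

// With a live client (credentials required), each returned upload entry
// carries a presigned_url to PUT the file bytes to before expires_at:
// const res = await client.knowledgeBases.dataSources.createPresignedURLs(params);
// for (const upload of res.uploads ?? []) {
//   await fetch(upload.presigned_url!, { method: 'PUT', body: bytesFor(upload.original_file_name!) });
// }

console.log(params.files.length);
```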
### Example

```typescript
import Gradient from '@digitalocean/gradient';

const client = new Gradient();

const response = await client.knowledgeBases.dataSources.createPresignedURLs();

console.log(response.request_id);
```

## Domain Types

### API File Upload Data Source

- `APIFileUploadDataSource` File to upload as data source for knowledge base.
  - `original_file_name?: string` The original file name
  - `size_in_bytes?: string` The size of the file in bytes
  - `stored_object_key?: string` The object key the file was stored as

### API Knowledge Base Data Source

- `APIKnowledgeBaseDataSource` Data Source configuration for Knowledge Bases
  - `aws_data_source?: AwsDataSource` AWS S3 Data Source for Display
    - `bucket_name?: string` Spaces bucket name
    - `item_path?: string`
    - `region?: string` Region of bucket
  - `bucket_name?: string` Name of storage bucket - Deprecated, moved to data_source_details
  - `created_at?: string` Creation date / time
  - `dropbox_data_source?: DropboxDataSource` Dropbox Data Source for Display
    - `folder?: string`
  - `file_upload_data_source?: APIFileUploadDataSource` File to upload as data source for knowledge base.
    - `original_file_name?: string` The original file name
    - `size_in_bytes?: string` The size of the file in bytes
    - `stored_object_key?: string` The object key the file was stored as
  - `item_path?: string` Path of folder or object in bucket - Deprecated, moved to data_source_details
  - `last_datasource_indexing_job?: APIIndexedDataSource`
    - `completed_at?: string` Timestamp when data source completed indexing
    - `data_source_uuid?: string` Uuid of the indexed data source
    - `error_details?: string` A detailed error description
    - `error_msg?: string` A string code providing a hint about which part of the system experienced an error
    - `failed_item_count?: string` Total count of items that have failed
    - `indexed_file_count?: string` Total count of files that have been indexed
    - `indexed_item_count?: string` Total count of items that have been indexed
    - `removed_item_count?: string` Total count of items that have been removed
    - `skipped_item_count?: string` Total count of items that have been skipped
    - `started_at?: string` Timestamp when data source started indexing
    - `status?: "DATA_SOURCE_STATUS_UNKNOWN" | "DATA_SOURCE_STATUS_IN_PROGRESS" | "DATA_SOURCE_STATUS_UPDATED" | "DATA_SOURCE_STATUS_PARTIALLY_UPDATED" | "DATA_SOURCE_STATUS_NOT_UPDATED" | "DATA_SOURCE_STATUS_FAILED"`
    - `total_bytes?: string` Total size of files in data source in bytes
    - `total_bytes_indexed?: string` Total size of files in data source in bytes that have been indexed
    - `total_file_count?: string` Total file count in the data source
  - `last_indexing_job?: APIIndexingJob` IndexingJob description
    - `completed_datasources?: number` Number of data sources that have completed indexing
    - `created_at?: string` Creation date / time
    - `data_source_uuids?: Array`
    - `finished_at?: string`
    - `knowledge_base_uuid?: string` Knowledge base id
    - `phase?: "BATCH_JOB_PHASE_UNKNOWN" | "BATCH_JOB_PHASE_PENDING" | "BATCH_JOB_PHASE_RUNNING" | "BATCH_JOB_PHASE_SUCCEEDED" | "BATCH_JOB_PHASE_FAILED" | "BATCH_JOB_PHASE_ERROR" | "BATCH_JOB_PHASE_CANCELLED"`
    - `started_at?: string`
    - `status?: "INDEX_JOB_STATUS_UNKNOWN" | "INDEX_JOB_STATUS_PARTIAL" | "INDEX_JOB_STATUS_IN_PROGRESS" | "INDEX_JOB_STATUS_COMPLETED" | "INDEX_JOB_STATUS_FAILED" | "INDEX_JOB_STATUS_NO_CHANGES" | "INDEX_JOB_STATUS_PENDING"`
    - `tokens?: number` Number of tokens
    - `total_datasources?: number` Number of data sources being indexed
    - `total_items_failed?: string` Total items failed
    - `total_items_indexed?: string` Total items indexed
    - `total_items_skipped?: string` Total items skipped
    - `updated_at?: string` Last modified
    - `uuid?: string` Unique id
  - `region?: string` Region code - Deprecated, moved to data_source_details
  - `spaces_data_source?: APISpacesDataSource` Spaces Bucket Data Source
    - `bucket_name?: string` Spaces bucket name
    - `item_path?: string`
    - `region?: string` Region of bucket
  - `updated_at?: string` Last modified
  - `uuid?: string` Unique id of knowledge base
  - `web_crawler_data_source?: APIWebCrawlerDataSource` WebCrawlerDataSource
    - `base_url?: string` The base URL to crawl.
    - `crawling_option?: "UNKNOWN" | "SCOPED" | "PATH" | "DOMAIN" | "SUBDOMAINS"` Options for specifying how URLs found on pages should be handled.
      - `UNKNOWN`: Default unknown value
      - `SCOPED`: Only include the base URL.
      - `PATH`: Crawl the base URL and linked pages within the URL path.
      - `DOMAIN`: Crawl the base URL and linked pages within the same domain.
      - `SUBDOMAINS`: Crawl the base URL and linked pages for any subdomain.
    - `embed_media?: boolean` Whether to ingest and index media (images, etc.) on web pages.
### API Spaces Data Source

- `APISpacesDataSource` Spaces Bucket Data Source
  - `bucket_name?: string` Spaces bucket name
  - `item_path?: string`
  - `region?: string` Region of bucket

### API Web Crawler Data Source

- `APIWebCrawlerDataSource` WebCrawlerDataSource
  - `base_url?: string` The base URL to crawl.
  - `crawling_option?: "UNKNOWN" | "SCOPED" | "PATH" | "DOMAIN" | "SUBDOMAINS"` Options for specifying how URLs found on pages should be handled.
    - `UNKNOWN`: Default unknown value
    - `SCOPED`: Only include the base URL.
    - `PATH`: Crawl the base URL and linked pages within the URL path.
    - `DOMAIN`: Crawl the base URL and linked pages within the same domain.
    - `SUBDOMAINS`: Crawl the base URL and linked pages for any subdomain.
  - `embed_media?: boolean` Whether to ingest and index media (images, etc.) on web pages.

### Aws Data Source

- `AwsDataSource` AWS S3 Data Source
  - `bucket_name?: string` Spaces bucket name
  - `item_path?: string`
  - `key_id?: string` The AWS Key ID
  - `region?: string` Region of bucket
  - `secret_key?: string` The AWS Secret Key
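A returned `APIKnowledgeBaseDataSource` carries at most one of these source configurations, so a small discriminator can report which kind a given record holds. A sketch; the `sourceKind` function is ours, but the field names match the domain types above:

```typescript
// Minimal view of the fields we discriminate on.
interface DataSourceLike {
  aws_data_source?: object;
  spaces_data_source?: object;
  web_crawler_data_source?: object;
  file_upload_data_source?: object;
  dropbox_data_source?: object;
}

type SourceKind = 'aws' | 'spaces' | 'web_crawler' | 'file_upload' | 'dropbox' | 'unknown';

// Report which (single) source configuration a data source record holds.
function sourceKind(ds: DataSourceLike): SourceKind {
  if (ds.aws_data_source) return 'aws';
  if (ds.spaces_data_source) return 'spaces';
  if (ds.web_crawler_data_source) return 'web_crawler';
  if (ds.file_upload_data_source) return 'file_upload';
  if (ds.dropbox_data_source) return 'dropbox';
  return 'unknown';
}

console.log(sourceKind({ web_crawler_data_source: { base_url: 'https://example.com' } }));
```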