
Add Data Source to a Knowledge Base

client.KnowledgeBases.DataSources.New(ctx, knowledgeBaseUuid, body) (*KnowledgeBaseDataSourceNewResponse, error)
POST /v2/gen-ai/knowledge_bases/{knowledge_base_uuid}/data_sources

To add a data source to a knowledge base, send a POST request to /v2/gen-ai/knowledge_bases/{knowledge_base_uuid}/data_sources.
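For readers calling the endpoint without the SDK, the request above can be sketched with the Go standard library. This is a minimal illustration, not the SDK's implementation: the `baseURL` host is a placeholder, the bearer-token header is an assumption about how the access token is sent, and `newAddDataSourceRequest` is a hypothetical helper name.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"net/url"
)

// baseURL is a placeholder host used only for illustration (assumption).
const baseURL = "https://api.example.com"

// newAddDataSourceRequest builds, but does not send, the POST request
// for adding a data source to the given knowledge base.
func newAddDataSourceRequest(knowledgeBaseUUID, token string, jsonBody []byte) (*http.Request, error) {
	endpoint := baseURL + "/v2/gen-ai/knowledge_bases/" +
		url.PathEscape(knowledgeBaseUUID) + "/data_sources"
	req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(jsonBody))
	if err != nil {
		return nil, err
	}
	// Assumption: the access token is passed as a bearer token.
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newAddDataSourceRequest(
		"123e4567-e89b-12d3-a456-426614174000", "My Access Token", []byte(`{}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Path)
}
```

The helper only constructs the request; sending it with an `http.Client` and decoding the response are left to the caller.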

Parameters
knowledgeBaseUuid string
body KnowledgeBaseDataSourceNewParams
AwsDataSource param.Field[AwsDataSource] optional

AWS S3 Data Source

ChunkingAlgorithm param.Field[KnowledgeBaseDataSourceNewParamsChunkingAlgorithm] optional

The chunking algorithm to use for processing data sources.

Note: This feature requires enabling the knowledgebase enhancements feature preview flag.

const KnowledgeBaseDataSourceNewParamsChunkingAlgorithmChunkingAlgorithmUnknown KnowledgeBaseDataSourceNewParamsChunkingAlgorithm = "CHUNKING_ALGORITHM_UNKNOWN"
const KnowledgeBaseDataSourceNewParamsChunkingAlgorithmChunkingAlgorithmSectionBased KnowledgeBaseDataSourceNewParamsChunkingAlgorithm = "CHUNKING_ALGORITHM_SECTION_BASED"
const KnowledgeBaseDataSourceNewParamsChunkingAlgorithmChunkingAlgorithmHierarchical KnowledgeBaseDataSourceNewParamsChunkingAlgorithm = "CHUNKING_ALGORITHM_HIERARCHICAL"
const KnowledgeBaseDataSourceNewParamsChunkingAlgorithmChunkingAlgorithmSemantic KnowledgeBaseDataSourceNewParamsChunkingAlgorithm = "CHUNKING_ALGORITHM_SEMANTIC"
const KnowledgeBaseDataSourceNewParamsChunkingAlgorithmChunkingAlgorithmFixedLength KnowledgeBaseDataSourceNewParamsChunkingAlgorithm = "CHUNKING_ALGORITHM_FIXED_LENGTH"
ChunkingOptions param.Field[KnowledgeBaseDataSourceNewParamsChunkingOptions] optional

Configuration options for the chunking algorithm.

Note: This feature requires enabling the knowledgebase enhancements feature preview flag.

ChildChunkSize int64 optional

Hierarchical options

format: int64
MaxChunkSize int64 optional

Section_Based and Fixed_Length options

format: int64
ParentChunkSize int64 optional

Hierarchical options

format: int64
SemanticThreshold float64 optional

Semantic options

format: float
KnowledgeBaseUuid param.Field[string] optional

Knowledge base ID

SpacesDataSource param.Field[APISpacesDataSource] optional

Spaces Bucket Data Source

WebCrawlerDataSource param.Field[APIWebCrawlerDataSource] optional

WebCrawlerDataSource
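To make the shape of the request body concrete, the sketch below marshals a body that combines hierarchical chunking with a web crawler source. The structs are local stand-ins, not the SDK's types, and the JSON field names are an assumption: they mirror the snake_case keys in the response example on this page. The base URL and sizes are illustrative values.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local illustrative types (assumption): JSON keys mirror the response example.
type chunkingOptions struct {
	ChildChunkSize  int64 `json:"child_chunk_size,omitempty"`
	ParentChunkSize int64 `json:"parent_chunk_size,omitempty"`
}

type webCrawlerDataSource struct {
	BaseURL        string `json:"base_url,omitempty"`
	CrawlingOption string `json:"crawling_option,omitempty"`
}

type newDataSourceBody struct {
	ChunkingAlgorithm    string                `json:"chunking_algorithm,omitempty"`
	ChunkingOptions      *chunkingOptions      `json:"chunking_options,omitempty"`
	WebCrawlerDataSource *webCrawlerDataSource `json:"web_crawler_data_source,omitempty"`
}

// buildBody returns the JSON encoding of an example request body.
func buildBody() ([]byte, error) {
	body := newDataSourceBody{
		ChunkingAlgorithm: "CHUNKING_ALGORITHM_HIERARCHICAL",
		ChunkingOptions: &chunkingOptions{
			ChildChunkSize:  350,
			ParentChunkSize: 1000,
		},
		WebCrawlerDataSource: &webCrawlerDataSource{
			BaseURL:        "https://example.com/docs",
			CrawlingOption: "PATH",
		},
	}
	return json.Marshal(body)
}

func main() {
	b, err := buildBody()
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

Pointer fields with `omitempty` keep unset data sources out of the payload, which matches the fact that every parameter here is optional.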

Returns
type KnowledgeBaseDataSourceNewResponse struct{…}

Information about a newly created knowledge base data source

KnowledgeBaseDataSource APIKnowledgeBaseDataSource optional

Data Source configuration for Knowledge Bases

AwsDataSource APIKnowledgeBaseDataSourceAwsDataSource optional

AWS S3 Data Source for Display

BucketName string optional

Spaces bucket name

ItemPath string optional
Region string optional

Region of bucket

BucketName string optional

Name of storage bucket - Deprecated, moved to data_source_details

ChunkingAlgorithm APIKnowledgeBaseDataSourceChunkingAlgorithm optional

The chunking algorithm to use for processing data sources.

Note: This feature requires enabling the knowledgebase enhancements feature preview flag.

Accepts one of the following:
const APIKnowledgeBaseDataSourceChunkingAlgorithmChunkingAlgorithmUnknown APIKnowledgeBaseDataSourceChunkingAlgorithm = "CHUNKING_ALGORITHM_UNKNOWN"
const APIKnowledgeBaseDataSourceChunkingAlgorithmChunkingAlgorithmSectionBased APIKnowledgeBaseDataSourceChunkingAlgorithm = "CHUNKING_ALGORITHM_SECTION_BASED"
const APIKnowledgeBaseDataSourceChunkingAlgorithmChunkingAlgorithmHierarchical APIKnowledgeBaseDataSourceChunkingAlgorithm = "CHUNKING_ALGORITHM_HIERARCHICAL"
const APIKnowledgeBaseDataSourceChunkingAlgorithmChunkingAlgorithmSemantic APIKnowledgeBaseDataSourceChunkingAlgorithm = "CHUNKING_ALGORITHM_SEMANTIC"
const APIKnowledgeBaseDataSourceChunkingAlgorithmChunkingAlgorithmFixedLength APIKnowledgeBaseDataSourceChunkingAlgorithm = "CHUNKING_ALGORITHM_FIXED_LENGTH"
ChunkingOptions APIKnowledgeBaseDataSourceChunkingOptions optional

Configuration options for the chunking algorithm.

Note: This feature requires enabling the knowledgebase enhancements feature preview flag.

ChildChunkSize int64 optional

Hierarchical options

format: int64
MaxChunkSize int64 optional

Section_Based and Fixed_Length options

format: int64
ParentChunkSize int64 optional

Hierarchical options

format: int64
SemanticThreshold float64 optional

Semantic options

format: float
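The option descriptions above tie each chunking field to particular algorithms. A small lookup makes that pairing explicit. This is a sketch based only on those descriptions: `relevantOptions` is a hypothetical helper, and the snake_case option names are taken from the response example rather than from any documented validation rule.

```go
package main

import "fmt"

// relevantOptions lists the chunking_options fields that the reference
// associates with each chunking algorithm constant.
func relevantOptions(algorithm string) []string {
	switch algorithm {
	case "CHUNKING_ALGORITHM_HIERARCHICAL":
		return []string{"parent_chunk_size", "child_chunk_size"}
	case "CHUNKING_ALGORITHM_SECTION_BASED", "CHUNKING_ALGORITHM_FIXED_LENGTH":
		return []string{"max_chunk_size"}
	case "CHUNKING_ALGORITHM_SEMANTIC":
		return []string{"semantic_threshold"}
	default:
		// CHUNKING_ALGORITHM_UNKNOWN and unrecognized values have no listed options.
		return nil
	}
}

func main() {
	fmt.Println(relevantOptions("CHUNKING_ALGORITHM_HIERARCHICAL"))
}
```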
CreatedAt Time optional

Creation date / time

format: date-time
DropboxDataSource APIKnowledgeBaseDataSourceDropboxDataSource optional

Dropbox Data Source for Display

Folder string optional
FileUploadDataSource APIFileUploadDataSource optional

File to upload as data source for knowledge base.

OriginalFileName string optional

The original file name

SizeInBytes string optional

The size of the file in bytes

format: uint64
StoredObjectKey string optional

The object key the file was stored as

GoogleDriveDataSource APIKnowledgeBaseDataSourceGoogleDriveDataSource optional

Google Drive Data Source for Display

FolderID string optional
FolderName string optional

Name of the selected folder if available

ItemPath string optional

Path of folder or object in bucket - Deprecated, moved to data_source_details

LastDatasourceIndexingJob APIIndexedDataSource optional
CompletedAt Time optional

Timestamp when data source completed indexing

format: date-time
DataSourceUuid string optional

UUID of the indexed data source

ErrorDetails string optional

A detailed error description

ErrorMsg string optional

A string code providing a hint as to which part of the system experienced an error

FailedItemCount string optional

Total count of files that have failed

format: uint64
IndexedFileCount string optional

Total count of files that have been indexed

format: uint64
IndexedItemCount string optional

Total count of files that have been indexed

format: uint64
RemovedItemCount string optional

Total count of files that have been removed

format: uint64
SkippedItemCount string optional

Total count of files that have been skipped

format: uint64
StartedAt Time optional

Timestamp when data source started indexing

format: date-time
Status APIIndexedDataSourceStatus optional
Accepts one of the following:
const APIIndexedDataSourceStatusDataSourceStatusUnknown APIIndexedDataSourceStatus = "DATA_SOURCE_STATUS_UNKNOWN"
const APIIndexedDataSourceStatusDataSourceStatusInProgress APIIndexedDataSourceStatus = "DATA_SOURCE_STATUS_IN_PROGRESS"
const APIIndexedDataSourceStatusDataSourceStatusUpdated APIIndexedDataSourceStatus = "DATA_SOURCE_STATUS_UPDATED"
const APIIndexedDataSourceStatusDataSourceStatusPartiallyUpdated APIIndexedDataSourceStatus = "DATA_SOURCE_STATUS_PARTIALLY_UPDATED"
const APIIndexedDataSourceStatusDataSourceStatusNotUpdated APIIndexedDataSourceStatus = "DATA_SOURCE_STATUS_NOT_UPDATED"
const APIIndexedDataSourceStatusDataSourceStatusFailed APIIndexedDataSourceStatus = "DATA_SOURCE_STATUS_FAILED"
const APIIndexedDataSourceStatusDataSourceStatusCancelled APIIndexedDataSourceStatus = "DATA_SOURCE_STATUS_CANCELLED"
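When polling an indexing job, a caller typically needs to know whether a status means the job is still running. The helper below is a sketch under an assumption that is not stated in this reference: every status except UNKNOWN and IN_PROGRESS is treated as final. `isFinished` is a hypothetical name.

```go
package main

import "fmt"

// isFinished reports whether a status string is treated as final.
// Assumption: only UNKNOWN and IN_PROGRESS indicate a job that may still change.
func isFinished(status string) bool {
	switch status {
	case "DATA_SOURCE_STATUS_UPDATED",
		"DATA_SOURCE_STATUS_PARTIALLY_UPDATED",
		"DATA_SOURCE_STATUS_NOT_UPDATED",
		"DATA_SOURCE_STATUS_FAILED",
		"DATA_SOURCE_STATUS_CANCELLED":
		return true
	default:
		return false
	}
}

func main() {
	for _, s := range []string{"DATA_SOURCE_STATUS_IN_PROGRESS", "DATA_SOURCE_STATUS_UPDATED"} {
		fmt.Println(s, isFinished(s))
	}
}
```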
TotalBytes string optional

Total size of files in data source in bytes

format: uint64
TotalBytesIndexed string optional

Total size of files in data source in bytes that have been indexed

format: uint64
TotalFileCount string optional

Total file count in the data source

format: uint64
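The counters above are typed as strings with a uint64 format, so they have to be parsed before any arithmetic. The sketch below shows one way to do that with `strconv.ParseUint` and derive an indexed-bytes fraction. `indexingProgress` is a hypothetical helper and the fraction is an illustrative metric, not one the API reports.

```go
package main

import (
	"fmt"
	"strconv"
)

// indexingProgress parses the string-encoded uint64 byte counters and
// returns the fraction of bytes indexed (0 when the total is zero).
func indexingProgress(totalBytes, totalBytesIndexed string) (float64, error) {
	total, err := strconv.ParseUint(totalBytes, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("total_bytes: %w", err)
	}
	indexed, err := strconv.ParseUint(totalBytesIndexed, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("total_bytes_indexed: %w", err)
	}
	if total == 0 {
		return 0, nil
	}
	return float64(indexed) / float64(total), nil
}

func main() {
	p, err := indexingProgress("200", "50")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%.2f\n", p)
}
```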
Region string optional

Region code - Deprecated, moved to data_source_details

SpacesDataSource APISpacesDataSource optional

Spaces Bucket Data Source

BucketName string optional

Spaces bucket name

ItemPath string optional
Region string optional

Region of bucket

UpdatedAt Time optional

Last modified

format: date-time
Uuid string optional

Unique id of knowledge base

WebCrawlerDataSource APIWebCrawlerDataSource optional

WebCrawlerDataSource

BaseURL string optional

The base URL to crawl.

CrawlingOption APIWebCrawlerDataSourceCrawlingOption optional

Options for specifying how URLs found on pages should be handled.

  • UNKNOWN: Default unknown value
  • SCOPED: Only include the base URL.
  • PATH: Crawl the base URL and linked pages within the URL path.
  • DOMAIN: Crawl the base URL and linked pages within the same domain.
  • SUBDOMAINS: Crawl the base URL and linked pages for any subdomain.
  • SITEMAP: Crawl URLs discovered in the sitemap.
Accepts one of the following:
const APIWebCrawlerDataSourceCrawlingOptionUnknown APIWebCrawlerDataSourceCrawlingOption = "UNKNOWN"
const APIWebCrawlerDataSourceCrawlingOptionScoped APIWebCrawlerDataSourceCrawlingOption = "SCOPED"
const APIWebCrawlerDataSourceCrawlingOptionPath APIWebCrawlerDataSourceCrawlingOption = "PATH"
const APIWebCrawlerDataSourceCrawlingOptionDomain APIWebCrawlerDataSourceCrawlingOption = "DOMAIN"
const APIWebCrawlerDataSourceCrawlingOptionSubdomains APIWebCrawlerDataSourceCrawlingOption = "SUBDOMAINS"
const APIWebCrawlerDataSourceCrawlingOptionSitemap APIWebCrawlerDataSourceCrawlingOption = "SITEMAP"
EmbedMedia bool optional

Whether to ingest and index media (images, etc.) on web pages.

ExcludeTags []string optional

Tags to exclude from web pages while crawling.

Add Data Source to a Knowledge Base
package main

import (
  "context"
  "fmt"

  "github.com/stainless-sdks/-go"
  "github.com/stainless-sdks/-go/option"
)

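func main() {
  // Create a client authenticated with an access token.
  client := gradient.NewClient(
    option.WithAccessToken("My Access Token"),
  )
  // Add a data source to the knowledge base identified by its UUID.
  dataSource, err := client.KnowledgeBases.DataSources.New(
    context.TODO(),
    "123e4567-e89b-12d3-a456-426614174000",
    gradient.KnowledgeBaseDataSourceNewParams{
      // Populate optional fields such as SpacesDataSource or WebCrawlerDataSource here.
    },
  )
  if err != nil {
    panic(err.Error())
  }
  fmt.Printf("%+v\n", dataSource.KnowledgeBaseDataSource)
}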
Returns Examples
{
  "knowledge_base_data_source": {
    "aws_data_source": {
      "bucket_name": "example name",
      "item_path": "example string",
      "region": "example string"
    },
    "bucket_name": "example name",
    "chunking_algorithm": "CHUNKING_ALGORITHM_SECTION_BASED",
    "chunking_options": {
      "child_chunk_size": 350,
      "max_chunk_size": 750,
      "parent_chunk_size": 1000,
      "semantic_threshold": 0.5
    },
    "created_at": "2023-01-01T00:00:00Z",
    "dropbox_data_source": {
      "folder": "example string"
    },
    "file_upload_data_source": {
      "original_file_name": "example name",
      "size_in_bytes": "12345",
      "stored_object_key": "example string"
    },
    "google_drive_data_source": {
      "folder_id": "123e4567-e89b-12d3-a456-426614174000",
      "folder_name": "example name"
    },
    "item_path": "example string",
    "last_datasource_indexing_job": {
      "completed_at": "2023-01-01T00:00:00Z",
      "data_source_uuid": "123e4567-e89b-12d3-a456-426614174000",
      "error_details": "example string",
      "error_msg": "example string",
      "failed_item_count": "12345",
      "indexed_file_count": "12345",
      "indexed_item_count": "12345",
      "removed_item_count": "12345",
      "skipped_item_count": "12345",
      "started_at": "2023-01-01T00:00:00Z",
      "status": "DATA_SOURCE_STATUS_UNKNOWN",
      "total_bytes": "12345",
      "total_bytes_indexed": "12345",
      "total_file_count": "12345"
    },
    "region": "example string",
    "spaces_data_source": {
      "bucket_name": "example name",
      "item_path": "example string",
      "region": "example string"
    },
    "updated_at": "2023-01-01T00:00:00Z",
    "uuid": "123e4567-e89b-12d3-a456-426614174000",
    "web_crawler_data_source": {
      "base_url": "example string",
      "crawling_option": "UNKNOWN",
      "embed_media": true,
      "exclude_tags": [
        "example string"
      ]
    }
  }
}