- Installation
- Streaming Support
- Getting Started
- Reference
humanloop.chat
humanloop.chatDeployed
humanloop.chatExperiment
humanloop.chatModelConfig
humanloop.complete
humanloop.completeDeployed
humanloop.completeExperiment
humanloop.completeModelConfiguration
humanloop.datapoints.delete
humanloop.datapoints.get
humanloop.datapoints.update
humanloop.datasets.create
humanloop.datasets.createDatapoint
humanloop.datasets.delete
humanloop.datasets.get
humanloop.datasets.list
humanloop.datasets.listAllForProject
humanloop.datasets.listDatapoints
humanloop.datasets.update
humanloop.evaluations.addEvaluators
humanloop.evaluations.create
humanloop.evaluations.get
humanloop.evaluations.list
humanloop.evaluations.listAllForProject
humanloop.evaluations.listDatapoints
humanloop.evaluations.log
humanloop.evaluations.result
humanloop.evaluations.updateStatus
humanloop.evaluators.create
humanloop.evaluators.delete
humanloop.evaluators.get
humanloop.evaluators.list
humanloop.evaluators.update
humanloop.experiments.create
humanloop.experiments.delete
humanloop.experiments.list
humanloop.experiments.sample
humanloop.experiments.update
humanloop.feedback
humanloop.logs.delete
humanloop.logs.get
humanloop.logs.list
humanloop.log
humanloop.logs.update
humanloop.logs.updateByRef
humanloop.modelConfigs.deserialize
humanloop.modelConfigs.export
humanloop.modelConfigs.get
humanloop.modelConfigs.register
humanloop.modelConfigs.serialize
humanloop.projects.create
humanloop.projects.createFeedbackType
humanloop.projects.deactivateConfig
humanloop.projects.deactivateExperiment
humanloop.projects.delete
humanloop.projects.deleteDeployedConfig
humanloop.projects.deployConfig
humanloop.projects.export
humanloop.projects.get
humanloop.projects.getActiveConfig
humanloop.projects.list
humanloop.projects.listConfigs
humanloop.projects.listDeployedConfigs
humanloop.projects.update
humanloop.projects.updateFeedbackTypes
humanloop.sessions.create
humanloop.sessions.get
humanloop.sessions.list
| npm | pnpm | yarn |
| --- | --- | --- |
| `npm i humanloop` | `pnpm i humanloop` | `yarn add humanloop` |
This SDK supports streaming; see example usage in a Next.js application.
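As noted in the parameter docs below, when `stream: true` is set, tokens are sent as data-only server-sent events. The SDK consumes these for you; purely as an illustration of the wire format, a minimal parser for such a stream might look like this (the `output` field name and the `[DONE]` sentinel are assumptions, not confirmed by this reference):

```typescript
// Minimal parser for a data-only server-sent-events payload, as produced
// when `stream: true` is set. Hypothetical helper for illustration only;
// the SDK surfaces streamed tokens for you.
interface StreamChunk {
  output: string; // token text for this event (assumed field name)
}

function parseSseEvents(raw: string): StreamChunk[] {
  return raw
    .split("\n\n") // events are separated by a blank line
    .map((block) => block.trim())
    .filter((block) => block.startsWith("data:"))
    .map((block) => block.slice("data:".length).trim())
    .filter((data) => data !== "[DONE]") // common end-of-stream sentinel
    .map((data) => JSON.parse(data) as StreamChunk);
}
```

Concatenating `output` across chunks yields the full response text.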
import { Humanloop } from "humanloop";
const humanloop = new Humanloop({
// Defining the base path is optional and defaults to https://api.humanloop.com/v4
// basePath: "https://api.humanloop.com/v4",
openaiApiKey: "openaiApiKey",
anthropicApiKey: "anthropicApiKey",
apiKey: "API_KEY",
});
const chatResponse = await humanloop.chat({
project: "sdk-example",
messages: [
{
role: "user",
content: "Explain asynchronous programming.",
},
],
model_config: {
model: "gpt-3.5-turbo",
max_tokens: -1,
temperature: 0.7,
chat_template: [
{
role: "system",
content:
"You are a helpful assistant who replies in the style of {{persona}}.",
},
],
},
inputs: {
persona: "the pirate Blackbeard",
},
stream: false,
});
console.log(chatResponse);
const completeResponse = await humanloop.complete({
project: "sdk-example",
inputs: {
text: "Llamas that are well-socialized and trained to halter and lead after weaning and are very friendly and pleasant to be around. They are extremely curious and most will approach people easily. However, llamas that are bottle-fed or over-socialized and over-handled as youth will become extremely difficult to handle when mature, when they will begin to treat humans as they treat each other, which is characterized by bouts of spitting, kicking and neck wrestling.[33]",
},
model_config: {
model: "gpt-3.5-turbo",
max_tokens: -1,
temperature: 0.7,
prompt_template:
"Summarize this for a second-grade student:\n\nText:\n{{text}}\n\nSummary:\n",
},
stream: false,
});
console.log(completeResponse);
const feedbackResponse = await humanloop.feedback({
type: "rating",
value: "good",
data_id: "data_[...]",
user: "user@example.com",
});
console.log(feedbackResponse);
const logResponse = await humanloop.log({
project: "sdk-example",
inputs: {
text: "Llamas that are well-socialized and trained to halter and lead after weaning and are very friendly and pleasant to be around. They are extremely curious and most will approach people easily. However, llamas that are bottle-fed or over-socialized and over-handled as youth will become extremely difficult to handle when mature, when they will begin to treat humans as they treat each other, which is characterized by bouts of spitting, kicking and neck wrestling.[33]",
},
output:
"Llamas can be friendly and curious if they are trained to be around people, but if they are treated too much like pets when they are young, they can become difficult to handle when they grow up. This means they might spit, kick, and wrestle with their necks.",
source: "sdk",
config: {
model: "gpt-3.5-turbo",
max_tokens: -1,
temperature: 0.7,
prompt_template:
"Summarize this for a second-grade student:\n\nText:\n{{text}}\n\nSummary:\n",
type: "model",
},
});
console.log(logResponse);
Get a chat response by providing details of the model configuration in the request.
const createResponse = await humanloop.chat({
save: true,
num_samples: 1,
stream: false,
return_inputs: true,
messages: [
{
role: "user",
},
],
model_config: {
provider: "openai",
model: "model_example",
max_tokens: -1,
temperature: 1,
top_p: 1,
presence_penalty: 0,
frequency_penalty: 0,
endpoint: "complete",
},
});
messages: ChatMessageWithToolCall[]
The messages passed to the provider chat endpoint.
model_config: ModelConfigChatRequest
The model configuration used to create a chat response.
Unique project name. If no project exists with this name, a new project will be created.
Unique ID of a project to associate to the log. Either this or `project` must be provided.
ID of the session to associate the datapoint.
A unique string identifying the session to associate the datapoint to. Allows you to log multiple datapoints to a session (using an ID kept by your internal systems) by passing the same `session_reference_id` in subsequent log requests. Specify at most one of this or `session_id`.
ID associated to the parent datapoint in a session.
A unique string identifying the previously-logged parent datapoint in a session. Allows you to log nested datapoints with your internal system IDs by passing the same reference ID as `parent_id` in a prior log request. Specify at most one of this or `parent_id`. Note that this cannot refer to a datapoint being logged in the same request.
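The session-linking fields described above can be illustrated with plain request payloads. Field names follow this reference; all values (and the idea that the parent reference was set on an earlier log) are hypothetical:

```typescript
// Two log payloads linked into one session via IDs kept by your own system.
const sessionRef = "my-system-session-42"; // your internal session ID

const firstLog = {
  project: "sdk-example",
  session_reference_id: sessionRef, // starts (or joins) the session
  inputs: { question: "What is a llama?" },
};

const nestedLog = {
  project: "sdk-example",
  session_reference_id: sessionRef, // same session as the first log
  // Must match the reference ID of a log from a *prior* request --
  // never one being logged in the same request.
  parent_reference_id: "my-system-step-1",
  inputs: { question: "Summarize the answer." },
};
```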
The inputs passed to the prompt template.
Identifies where the model was called from.
Any additional metadata to record.
Whether the request/response payloads will be stored on Humanloop.
ID of the source datapoint if this is a log derived from a datapoint in a dataset.
provider_api_keys: ProviderApiKeys
API keys required by each provider to make API calls. The API keys provided here are not stored by Humanloop. If not specified here, Humanloop will fall back to the key saved to your organization.
The number of generations.
If true, tokens will be sent as data-only server-sent events. If num_samples > 1, samples are streamed back independently.
End-user ID passed through to provider call.
Deprecated field: the seed is instead set as part of the request.config object.
Whether to return the inputs in the response. If false, the response will contain an empty dictionary under inputs. This is useful for reducing the size of the response. Defaults to true.
tool_choice: ToolChoiceProperty
tool_call: ToolCallProperty
response_format: ResponseFormat
The format of the response. Only type json_object is currently supported for chat.
/chat
POST
Get a chat response using the project's active deployment.
The active deployment can be a specific model configuration or an experiment.
const createDeployedResponse = await humanloop.chatDeployed({
save: true,
num_samples: 1,
stream: false,
return_inputs: true,
messages: [
{
role: "user",
},
],
});
messages: ChatMessageWithToolCall[]
The messages passed to the provider chat endpoint.
Unique project name. If no project exists with this name, a new project will be created.
Unique ID of a project to associate to the log. Either this or `project` must be provided.
ID of the session to associate the datapoint.
A unique string identifying the session to associate the datapoint to. Allows you to log multiple datapoints to a session (using an ID kept by your internal systems) by passing the same `session_reference_id` in subsequent log requests. Specify at most one of this or `session_id`.
ID associated to the parent datapoint in a session.
A unique string identifying the previously-logged parent datapoint in a session. Allows you to log nested datapoints with your internal system IDs by passing the same reference ID as `parent_id` in a prior log request. Specify at most one of this or `parent_id`. Note that this cannot refer to a datapoint being logged in the same request.
The inputs passed to the prompt template.
Identifies where the model was called from.
Any additional metadata to record.
Whether the request/response payloads will be stored on Humanloop.
ID of the source datapoint if this is a log derived from a datapoint in a dataset.
provider_api_keys: ProviderApiKeys
API keys required by each provider to make API calls. The API keys provided here are not stored by Humanloop. If not specified here, Humanloop will fall back to the key saved to your organization.
The number of generations.
If true, tokens will be sent as data-only server-sent events. If num_samples > 1, samples are streamed back independently.
End-user ID passed through to provider call.
Deprecated field: the seed is instead set as part of the request.config object.
Whether to return the inputs in the response. If false, the response will contain an empty dictionary under inputs. This is useful for reducing the size of the response. Defaults to true.
tool_choice: ToolChoiceProperty
tool_call: ToolCallProperty
response_format: ResponseFormat
The format of the response. Only type json_object is currently supported for chat.
The environment name used to create a chat response. If not specified, the default environment will be used.
/chat-deployed
POST
Get a chat response for a specific experiment.
const createExperimentResponse = await humanloop.chatExperiment({
save: true,
num_samples: 1,
stream: false,
return_inputs: true,
messages: [
{
role: "user",
},
],
experiment_id: "experiment_id_example",
});
messages: ChatMessageWithToolCall[]
The messages passed to the provider chat endpoint.
If an experiment ID is provided, a model configuration will be sampled from the experiment's active model configurations.
Unique project name. If no project exists with this name, a new project will be created.
Unique ID of a project to associate to the log. Either this or `project` must be provided.
ID of the session to associate the datapoint.
A unique string identifying the session to associate the datapoint to. Allows you to log multiple datapoints to a session (using an ID kept by your internal systems) by passing the same `session_reference_id` in subsequent log requests. Specify at most one of this or `session_id`.
ID associated to the parent datapoint in a session.
A unique string identifying the previously-logged parent datapoint in a session. Allows you to log nested datapoints with your internal system IDs by passing the same reference ID as `parent_id` in a prior log request. Specify at most one of this or `parent_id`. Note that this cannot refer to a datapoint being logged in the same request.
The inputs passed to the prompt template.
Identifies where the model was called from.
Any additional metadata to record.
Whether the request/response payloads will be stored on Humanloop.
ID of the source datapoint if this is a log derived from a datapoint in a dataset.
provider_api_keys: ProviderApiKeys
API keys required by each provider to make API calls. The API keys provided here are not stored by Humanloop. If not specified here, Humanloop will fall back to the key saved to your organization.
The number of chat responses, where each chat response will use a model configuration sampled from the experiment.
If true, tokens will be sent as data-only server-sent events. If num_samples > 1, samples are streamed back independently.
End-user ID passed through to provider call.
Deprecated field: the seed is instead set as part of the request.config object.
Whether to return the inputs in the response. If false, the response will contain an empty dictionary under inputs. This is useful for reducing the size of the response. Defaults to true.
tool_choice: ToolChoiceProperty
tool_call: ToolCallProperty
response_format: ResponseFormat
The format of the response. Only type json_object is currently supported for chat.
/chat-experiment
POST
Get chat response for a specific model configuration.
const createModelConfigResponse = await humanloop.chatModelConfig({
save: true,
num_samples: 1,
stream: false,
return_inputs: true,
messages: [
{
role: "user",
},
],
model_config_id: "model_config_id_example",
});
messages: ChatMessageWithToolCall[]
The messages passed to the provider chat endpoint.
Identifies the model configuration used to create a chat response.
Unique project name. If no project exists with this name, a new project will be created.
Unique ID of a project to associate to the log. Either this or `project` must be provided.
ID of the session to associate the datapoint.
A unique string identifying the session to associate the datapoint to. Allows you to log multiple datapoints to a session (using an ID kept by your internal systems) by passing the same `session_reference_id` in subsequent log requests. Specify at most one of this or `session_id`.
ID associated to the parent datapoint in a session.
A unique string identifying the previously-logged parent datapoint in a session. Allows you to log nested datapoints with your internal system IDs by passing the same reference ID as `parent_id` in a prior log request. Specify at most one of this or `parent_id`. Note that this cannot refer to a datapoint being logged in the same request.
The inputs passed to the prompt template.
Identifies where the model was called from.
Any additional metadata to record.
Whether the request/response payloads will be stored on Humanloop.
ID of the source datapoint if this is a log derived from a datapoint in a dataset.
provider_api_keys: ProviderApiKeys
API keys required by each provider to make API calls. The API keys provided here are not stored by Humanloop. If not specified here, Humanloop will fall back to the key saved to your organization.
The number of generations.
If true, tokens will be sent as data-only server-sent events. If num_samples > 1, samples are streamed back independently.
End-user ID passed through to provider call.
Deprecated field: the seed is instead set as part of the request.config object.
Whether to return the inputs in the response. If false, the response will contain an empty dictionary under inputs. This is useful for reducing the size of the response. Defaults to true.
tool_choice: ToolChoiceProperty
tool_call: ToolCallProperty
response_format: ResponseFormat
The format of the response. Only type json_object is currently supported for chat.
/chat-model-config
POST
Create a completion by providing details of the model configuration in the request.
const createResponse = await humanloop.complete({
save: true,
num_samples: 1,
stream: false,
return_inputs: true,
model_config: {
provider: "openai",
model: "model_example",
max_tokens: -1,
temperature: 1,
top_p: 1,
presence_penalty: 0,
frequency_penalty: 0,
endpoint: "complete",
prompt_template: "{{question}}",
},
});
model_config: ModelConfigCompletionRequest
The model configuration used to generate.
Unique project name. If no project exists with this name, a new project will be created.
Unique ID of a project to associate to the log. Either this or `project` must be provided.
ID of the session to associate the datapoint.
A unique string identifying the session to associate the datapoint to. Allows you to log multiple datapoints to a session (using an ID kept by your internal systems) by passing the same `session_reference_id` in subsequent log requests. Specify at most one of this or `session_id`.
ID associated to the parent datapoint in a session.
A unique string identifying the previously-logged parent datapoint in a session. Allows you to log nested datapoints with your internal system IDs by passing the same reference ID as `parent_id` in a prior log request. Specify at most one of this or `parent_id`. Note that this cannot refer to a datapoint being logged in the same request.
The inputs passed to the prompt template.
Identifies where the model was called from.
Any additional metadata to record.
Whether the request/response payloads will be stored on Humanloop.
ID of the source datapoint if this is a log derived from a datapoint in a dataset.
provider_api_keys: ProviderApiKeys
API keys required by each provider to make API calls. The API keys provided here are not stored by Humanloop. If not specified here, Humanloop will fall back to the key saved to your organization.
The number of generations.
If true, tokens will be sent as data-only server-sent events. If num_samples > 1, samples are streamed back independently.
End-user ID passed through to provider call.
Deprecated field: the seed is instead set as part of the request.config object.
Whether to return the inputs in the response. If false, the response will contain an empty dictionary under inputs. This is useful for reducing the size of the response. Defaults to true.
Include the log probabilities of the top n tokens in the `provider_response`.
The suffix that comes after a completion of inserted text. Useful for completions that act like inserts.
/completion
POST
Create a completion using the project's active deployment.
The active deployment can be a specific model configuration or an experiment.
const createDeployedResponse = await humanloop.completeDeployed({
save: true,
num_samples: 1,
stream: false,
return_inputs: true,
});
Unique project name. If no project exists with this name, a new project will be created.
Unique ID of a project to associate to the log. Either this or `project` must be provided.
ID of the session to associate the datapoint.
A unique string identifying the session to associate the datapoint to. Allows you to log multiple datapoints to a session (using an ID kept by your internal systems) by passing the same `session_reference_id` in subsequent log requests. Specify at most one of this or `session_id`.
ID associated to the parent datapoint in a session.
A unique string identifying the previously-logged parent datapoint in a session. Allows you to log nested datapoints with your internal system IDs by passing the same reference ID as `parent_id` in a prior log request. Specify at most one of this or `parent_id`. Note that this cannot refer to a datapoint being logged in the same request.
The inputs passed to the prompt template.
Identifies where the model was called from.
Any additional metadata to record.
Whether the request/response payloads will be stored on Humanloop.
ID of the source datapoint if this is a log derived from a datapoint in a dataset.
provider_api_keys: ProviderApiKeys
API keys required by each provider to make API calls. The API keys provided here are not stored by Humanloop. If not specified here, Humanloop will fall back to the key saved to your organization.
The number of generations.
If true, tokens will be sent as data-only server-sent events. If num_samples > 1, samples are streamed back independently.
End-user ID passed through to provider call.
Deprecated field: the seed is instead set as part of the request.config object.
Whether to return the inputs in the response. If false, the response will contain an empty dictionary under inputs. This is useful for reducing the size of the response. Defaults to true.
Include the log probabilities of the top n tokens in the `provider_response`.
The suffix that comes after a completion of inserted text. Useful for completions that act like inserts.
The environment name used to create a completion. If not specified, the default environment will be used.
/completion-deployed
POST
Create a completion for a specific experiment.
const createExperimentResponse = await humanloop.completeExperiment({
save: true,
num_samples: 1,
stream: false,
return_inputs: true,
experiment_id: "experiment_id_example",
});
If an experiment ID is provided, a model configuration will be sampled from the experiment's active model configurations.
Unique project name. If no project exists with this name, a new project will be created.
Unique ID of a project to associate to the log. Either this or `project` must be provided.
ID of the session to associate the datapoint.
A unique string identifying the session to associate the datapoint to. Allows you to log multiple datapoints to a session (using an ID kept by your internal systems) by passing the same `session_reference_id` in subsequent log requests. Specify at most one of this or `session_id`.
ID associated to the parent datapoint in a session.
A unique string identifying the previously-logged parent datapoint in a session. Allows you to log nested datapoints with your internal system IDs by passing the same reference ID as `parent_id` in a prior log request. Specify at most one of this or `parent_id`. Note that this cannot refer to a datapoint being logged in the same request.
The inputs passed to the prompt template.
Identifies where the model was called from.
Any additional metadata to record.
Whether the request/response payloads will be stored on Humanloop.
ID of the source datapoint if this is a log derived from a datapoint in a dataset.
provider_api_keys: ProviderApiKeys
API keys required by each provider to make API calls. The API keys provided here are not stored by Humanloop. If not specified here, Humanloop will fall back to the key saved to your organization.
The number of chat responses, where each chat response will use a model configuration sampled from the experiment.
If true, tokens will be sent as data-only server-sent events. If num_samples > 1, samples are streamed back independently.
End-user ID passed through to provider call.
Deprecated field: the seed is instead set as part of the request.config object.
Whether to return the inputs in the response. If false, the response will contain an empty dictionary under inputs. This is useful for reducing the size of the response. Defaults to true.
Include the log probabilities of the top n tokens in the `provider_response`.
The suffix that comes after a completion of inserted text. Useful for completions that act like inserts.
/completion-experiment
POST
Create a completion for a specific model configuration.
const createModelConfigResponse = await humanloop.completeModelConfiguration({
save: true,
num_samples: 1,
stream: false,
return_inputs: true,
model_config_id: "model_config_id_example",
});
Identifies the model configuration used to create a completion.
Unique project name. If no project exists with this name, a new project will be created.
Unique ID of a project to associate to the log. Either this or `project` must be provided.
ID of the session to associate the datapoint.
A unique string identifying the session to associate the datapoint to. Allows you to log multiple datapoints to a session (using an ID kept by your internal systems) by passing the same `session_reference_id` in subsequent log requests. Specify at most one of this or `session_id`.
ID associated to the parent datapoint in a session.
A unique string identifying the previously-logged parent datapoint in a session. Allows you to log nested datapoints with your internal system IDs by passing the same reference ID as `parent_id` in a prior log request. Specify at most one of this or `parent_id`. Note that this cannot refer to a datapoint being logged in the same request.
The inputs passed to the prompt template.
Identifies where the model was called from.
Any additional metadata to record.
Whether the request/response payloads will be stored on Humanloop.
ID of the source datapoint if this is a log derived from a datapoint in a dataset.
provider_api_keys: ProviderApiKeys
API keys required by each provider to make API calls. The API keys provided here are not stored by Humanloop. If not specified here, Humanloop will fall back to the key saved to your organization.
The number of generations.
If true, tokens will be sent as data-only server-sent events. If num_samples > 1, samples are streamed back independently.
End-user ID passed through to provider call.
Deprecated field: the seed is instead set as part of the request.config object.
Whether to return the inputs in the response. If false, the response will contain an empty dictionary under inputs. This is useful for reducing the size of the response. Defaults to true.
Include the log probabilities of the top n tokens in the `provider_response`.
The suffix that comes after a completion of inserted text. Useful for completions that act like inserts.
/completion-model-config
POST
Delete a list of datapoints by their IDs.
WARNING: This endpoint has been decommissioned and no longer works. Please use the v5 datasets API instead.
const deleteResponse = await humanloop.datapoints.delete();
/datapoints
DELETE
Get a datapoint by ID.
const getResponse = await humanloop.datapoints.get({
id: "id_example",
});
String ID of datapoint.
/datapoints/{id}
GET
Edit the input, messages and criteria fields of a datapoint.
WARNING: This endpoint has been decommissioned and no longer works. Please use the v5 datasets API instead.
const updateResponse = await humanloop.datapoints.update({
id: "id_example",
});
String ID of datapoint.
/datapoints/{id}
PATCH
Create a new dataset for a project.
const createResponse = await humanloop.datasets.create({
projectId: "projectId_example",
description: "description_example",
name: "name_example",
});
The description of the dataset.
The name of the dataset.
/projects/{project_id}/datasets
POST
Create a new datapoint for a dataset.
Here in the v4 API, this has the following behaviour:
- Retrieve the current latest version of the dataset.
- Construct a new version of the dataset with the new testcases added.
- Store that latest version as a committed version with an autogenerated commit message, and return the new datapoints.
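The three steps above amount to a copy-on-write append: the latest version is never mutated, a new committed version is stored instead. A toy model of that behaviour (an illustration, not the actual implementation) might be:

```typescript
interface DatasetVersion {
  commitMessage: string;
  datapoints: string[]; // log IDs, per the request body below
}

// Toy copy-on-write append: build a new committed version containing the
// old datapoints plus the new ones, leaving prior versions untouched.
function addDatapoints(
  versions: DatasetVersion[],
  newLogIds: string[]
): DatasetVersion[] {
  const latest = versions[versions.length - 1] ?? {
    commitMessage: "initial",
    datapoints: [],
  };
  const next: DatasetVersion = {
    // stand-in for the autogenerated commit message
    commitMessage: `Added ${newLogIds.length} datapoint(s)`,
    datapoints: [...latest.datapoints, ...newLogIds],
  };
  return [...versions, next];
}
```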
const createDatapointResponse = await humanloop.datasets.createDatapoint({
datasetId: "dataset_id_example",
requestBody: {
log_ids: ["log_ids_example"],
},
});
String ID of dataset. Starts with `evts_`.
requestBody: DatasetsCreateDatapointRequest
/datasets/{dataset_id}/datapoints
POST
Delete a dataset by ID.
const deleteResponse = await humanloop.datasets.delete({
id: "id_example",
});
String ID of dataset. Starts with `evts_`.
/datasets/{id}
DELETE
Get a single dataset by ID.
const getResponse = await humanloop.datasets.get({
id: "id_example",
});
String ID of dataset. Starts with `evts_`.
/datasets/{id}
GET
Get all Datasets for an organization.
const listResponse = await humanloop.datasets.list();
/datasets
GET
Get all datasets for a project.
const listAllForProjectResponse = await humanloop.datasets.listAllForProject({
projectId: "projectId_example",
});
/projects/{project_id}/datasets
GET
Get datapoints for a dataset.
const listDatapointsResponse = await humanloop.datasets.listDatapoints({
datasetId: "datasetId_example",
page: 0,
size: 50,
});
String ID of dataset. Starts with `evts_`.
PaginatedDataDatapointResponse
/datasets/{dataset_id}/datapoints
GET
Update a dataset by ID.
const updateResponse = await humanloop.datasets.update({
id: "id_example",
});
String ID of dataset. Starts with `evts_`.
The description of the dataset.
The name of the dataset.
/datasets/{id}
PATCH
Add evaluators to an existing evaluation run.
const addEvaluatorsResponse = await humanloop.evaluations.addEvaluators({
id: "id_example",
evaluator_ids: ["evaluator_ids_example"],
});
IDs of evaluators to add to the evaluation run. IDs start with `evfn_`.
String ID of evaluation run. Starts with `ev_`.
/evaluations/{id}/evaluators
PATCH
Create an evaluation.
const createResponse = await humanloop.evaluations.create({
projectId: "projectId_example",
config_id: "config_id_example",
evaluator_ids: ["evaluator_ids_example"],
dataset_id: "dataset_id_example",
max_concurrency: 5,
hl_generated: true,
});
ID of the config to evaluate. Starts with `config_`.
IDs of evaluators to run on the dataset. IDs start with `evfn_`.
ID of the dataset to use in this evaluation. Starts with `evts_`.
String ID of project. Starts with `pr_`.
provider_api_keys: ProviderApiKeys
API keys required by each provider to make API calls. The API keys provided here are not stored by Humanloop. If not specified here, Humanloop will fall back to the key saved to your organization. Ensure you provide an API key for the provider for the model config you are evaluating, or have one saved to your organization.
The maximum number of concurrent generations to run. A higher value will result in faster completion of the evaluation but may place higher load on your provider rate-limits.
Whether the log generations for this evaluation should be performed by Humanloop. If `False`, the log generations should be submitted by the user via the API.
/projects/{project_id}/evaluations
POST
Get evaluation by ID.
const getResponse = await humanloop.evaluations.get({
id: "id_example",
});
String ID of evaluation run. Starts with `ev_`.
Whether to include evaluator aggregates in the response.
/evaluations/{id}
GET
Get the evaluations associated with a project.
Sorting and filtering are supported through query params for categorical columns and the `created_at` timestamp.
Sorting is supported for the `dataset`, `config`, `status` and `evaluator-{evaluator_id}` columns.
Specify sorting with the `sort` query param, with values `{column}.{ordering}`.
E.g. `?sort=dataset.asc&sort=status.desc` will yield a multi-column sort: first by dataset, then by status.
Filtering is supported for the `id`, `dataset`, `config` and `status` columns.
Specify filtering with the `id_filter`, `dataset_filter`, `config_filter` and `status_filter` query params.
E.g. `?dataset_filter=my_dataset&dataset_filter=my_other_dataset&status_filter=running` will only show rows where the dataset is "my_dataset" or "my_other_dataset", and where the status is "running".
An additional date range filter is supported for the `created_at` column. Use the `start_date` and `end_date` query parameters to configure this.
const listResponse = await humanloop.evaluations.list({
projectId: "projectId_example",
size: 50,
page: 0,
});
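The sorting and filtering rules described above compose into a query string with repeated keys, which can be built with the standard `URLSearchParams` (dataset names here are hypothetical):

```typescript
// Build the multi-sort, multi-filter query string described above.
// Repeated keys (sort, dataset_filter) are appended, not overwritten.
const params = new URLSearchParams();
params.append("sort", "dataset.asc"); // primary sort
params.append("sort", "status.desc"); // secondary sort
params.append("dataset_filter", "my_dataset");
params.append("dataset_filter", "my_other_dataset"); // OR'd together
params.append("status_filter", "running");

const query = `?${params.toString()}`;
// -> "?sort=dataset.asc&sort=status.desc&dataset_filter=my_dataset&dataset_filter=my_other_dataset&status_filter=running"
```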
String ID of project. Starts with `pr_`.
A list of evaluation run ids to filter on. Starts with `ev_`.
Only return evaluations created after this date.
Only return evaluations created before this date.
PaginatedDataEvaluationResponse
/evaluations
GET
Get all the evaluations associated with your project.
Deprecated: This is a legacy unpaginated endpoint. Use `/evaluations` instead, with appropriate sorting, filtering and pagination options.
const listAllForProjectResponse = await humanloop.evaluations.listAllForProject(
{
projectId: "projectId_example",
}
);
String ID of project. Starts with `pr_`.
Whether to include evaluator aggregates in the response.
/projects/{project_id}/evaluations
GET
Get testcases by evaluation ID.
const listDatapointsResponse = await humanloop.evaluations.listDatapoints({
id: "id_example",
page: 1,
size: 10,
});
String ID of evaluation. Starts with `ev_`.
Page to fetch. Starts from 1.
Number of evaluation results to retrieve.
PaginatedDataEvaluationDatapointSnapshotResponse
/evaluations/{id}/datapoints
GET
Log an external generation to an evaluation run for a datapoint.
The run must have status 'running'.
const logResponse = await humanloop.evaluations.log({
evaluationId: "evaluationId_example",
datapoint_id: "datapoint_id_example",
log: {
save: true,
},
});
The datapoint for which a log was generated. Must be one of the datapoints in the dataset being evaluated.
log: LogRequest
The log generated for the datapoint.
ID of the evaluation run. Starts with evrun_.
/evaluations/{evaluation_id}/log
POST
Log an evaluation result to an evaluation run.
The run must have status 'running'. One of result or error must be provided.
const resultResponse = await humanloop.evaluations.result({
evaluationId: "evaluationId_example",
log_id: "log_id_example",
evaluator_id: "evaluator_id_example",
});
The log that was evaluated. Must have as its source_datapoint_id one of the datapoints in the dataset being evaluated.
ID of the evaluator that evaluated the log. Starts with evfn_. Must be one of the evaluator IDs associated with the evaluation run being logged to.
ID of the evaluation run. Starts with evrun_.
result: ValueProperty
An error that occurred during evaluation.
/evaluations/{evaluation_id}/result
POST
Update the status of an evaluation run.
Can only be used to update the status of an evaluation run that uses external or human evaluators. The evaluation must currently have status 'running' if switching to 'completed', or it must have status 'completed' if switching back to 'running'.
const updateStatusResponse = await humanloop.evaluations.updateStatus({
id: "id_example",
status: "completed",
});
status: EvaluationStatus
The new status of the evaluation.
String ID of evaluation run. Starts with ev_.
/evaluations/{id}/status
PATCH
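The transition rule above (only 'running' and 'completed' may be swapped) can be sketched as a small guard. The additional enum values shown are assumptions for illustration:

```typescript
// Illustrative sketch (not SDK code) of the documented transition rule:
// 'running' -> 'completed' and 'completed' -> 'running' are the only
// status changes allowed when updating an evaluation run.
type EvaluationStatus = "pending" | "running" | "completed" | "failed";

function canTransition(from: EvaluationStatus, to: EvaluationStatus): boolean {
  return (
    (from === "running" && to === "completed") ||
    (from === "completed" && to === "running")
  );
}
```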
Create an evaluator within your organization.
const createResponse = await humanloop.evaluators.create({
description: "description_example",
name: "name_example",
arguments_type: "target_free",
return_type: "boolean",
type: "python",
});
The description of the evaluator.
The name of the evaluator.
arguments_type: EvaluatorArgumentsType
Whether this evaluator is target-free or target-required.
return_type: EvaluatorReturnTypeEnum
The type of the return value of the evaluator.
type: EvaluatorType
The type of the evaluator.
The code for the evaluator. This code will be executed in a sandboxed environment.
model_config: ModelConfigCompletionRequest
The model configuration used to generate.
/evaluators
POST
Delete an evaluator within your organization.
const deleteResponse = await humanloop.evaluators.delete({
id: "id_example",
});
/evaluators/{id}
DELETE
Get an evaluator within your organization.
const getResponse = await humanloop.evaluators.get({
id: "id_example",
});
/evaluators/{id}
GET
Get all evaluators within your organization.
const listResponse = await humanloop.evaluators.list();
/evaluators
GET
Update an evaluator within your organization.
const updateResponse = await humanloop.evaluators.update({
id: "id_example",
arguments_type: "target_free",
return_type: "boolean",
});
The description of the evaluator.
The name of the evaluator.
arguments_type: EvaluatorArgumentsType
Whether this evaluator is target-free or target-required.
return_type: EvaluatorReturnTypeEnum
The type of the return value of the evaluator.
The code for the evaluator. This code will be executed in a sandboxed environment.
model_config: ModelConfigCompletionRequest
The model configuration used to generate.
/evaluators/{id}
PATCH
Create an experiment for your project.
You can optionally specify IDs of your project's model configs to include in the experiment, along with a set of labels to consider as positive feedback and whether the experiment should be set as active.
const createResponse = await humanloop.experiments.create({
projectId: "projectId_example",
name: "name_example",
positive_labels: [
{
type: "type_example",
value: "value_example",
},
],
set_active: false,
});
Name of experiment.
positive_labels: PositiveLabel[]
Feedback labels to treat as positive user feedback. Used to monitor the performance of model configs in the experiment.
String ID of project. Starts with pr_.
Configs to add to this experiment. Further configs can be added later.
Whether to set the created experiment as the project's active experiment.
/projects/{project_id}/experiments
POST
Delete the experiment with the specified ID.
const deleteResponse = await humanloop.experiments.delete({
experimentId: "experimentId_example",
});
String ID of experiment. Starts with exp_.
/experiments/{experiment_id}
DELETE
Get an array of experiments associated to your project.
const listResponse = await humanloop.experiments.list({
projectId: "projectId_example",
});
String ID of project. Starts with pr_.
/projects/{project_id}/experiments
GET
Samples a model config from the experiment's active model configs.
const sampleResponse = await humanloop.experiments.sample({
experimentId: "experimentId_example",
});
String ID of experiment. Starts with exp_.
/experiments/{experiment_id}/model-config
GET
Update your experiment, including registering and de-registering model configs.
const updateResponse = await humanloop.experiments.update({
experimentId: "experimentId_example",
});
String ID of experiment. Starts with exp_.
Name of experiment.
positive_labels: PositiveLabel[]
Feedback labels to treat as positive user feedback. Used to monitor the performance of model configs in the experiment.
Model configs to add to this experiment.
Model configs in this experiment to be deactivated.
/experiments/{experiment_id}
PATCH
Submit an array of feedback for existing data_ids.
const feedbackResponse = await humanloop.feedback({
type: "string_example",
});
type: FeedbackTypeProperty
The feedback value to be set. This field should be left blank when unsetting 'rating', 'correction' or 'comment', but is required otherwise.
ID of the previously logged datapoint to associate the feedback with.
A unique identifier for who provided the feedback.
User defined timestamp for when the feedback was created.
If true, the value for this feedback is unset.
/feedback
POST
Delete logs.
const deleteResponse = await humanloop.logs.delete({});
/logs
DELETE
Retrieve a log by log id.
const getResponse = await humanloop.logs.get({
id: "id_example",
});
String ID of log to return. Starts with data_.
/logs/{id}
GET
Retrieve paginated logs from the server.
Sorting and filtering are supported through query params.
Sorting is supported for the source, model, timestamp, and feedback-{output_name} columns.
Specify sorting with the sort query param, with values {column}.{ordering}.
E.g. ?sort=source.asc&sort=model.desc will yield a multi-column sort: first by source, then by model.
Filtering is supported for the source, model, feedback-{output_name}, and evaluator-{evaluator_external_id} columns.
Specify filtering with the source_filter, model_filter, feedback-{output.name}_filter and evaluator-{evaluator_external_id}_filter query params.
E.g. ?source_filter=AI&source_filter=user_1234&feedback-explicit_filter=good will only show rows where the source is "AI" or "user_1234", and where the latest feedback for the "explicit" output group is "good".
An additional date range filter is supported for the Timestamp column (i.e. Log.created_at). This is supported through the start_date and end_date query parameters.
Searching is supported for the model inputs and output.
Specify a search term with the search query param.
E.g. ?search=hello%20there will cause a case-insensitive search across model inputs and output.
const listResponse = await humanloop.logs.list({
projectId: "projectId_example",
versionStatus: "uncommitted",
size: 50,
page: 0,
});
versionStatus: VersionStatus
/logs
GET
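The multi-sort, multi-filter and search params described above compose like this with the standard URLSearchParams API. This is plain platform code, not the SDK; note URLSearchParams encodes spaces as '+', which is equivalent to %20 in a query string:

```typescript
// Sketch only: building the documented logs list query string by hand.
const params = new URLSearchParams();
params.append("sort", "source.asc");
params.append("sort", "model.desc"); // multi-column sort
params.append("source_filter", "AI");
params.append("source_filter", "user_1234");
params.append("feedback-explicit_filter", "good");
params.append("search", "hello there"); // case-insensitive search
const logsQuery = `?${params.toString()}`;
```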
Log a datapoint or array of datapoints to your Humanloop project.
const logResponse = await humanloop.log({
save: true,
});
Unique project name. If no project exists with this name, a new project will be created.
Unique ID of a project to associate to the log. Either this or project must be provided.
ID of the session to associate the datapoint.
A unique string identifying the session to associate the datapoint to. Allows you to log multiple datapoints to a session (using an ID kept by your internal systems) by passing the same session_reference_id in subsequent log requests. Specify at most one of this or session_id.
ID associated to the parent datapoint in a session.
A unique string identifying the previously-logged parent datapoint in a session. Allows you to log nested datapoints with your internal system IDs by passing the same reference ID as parent_id in a prior log request. Specify at most one of this or parent_id. Note that this cannot refer to a datapoint being logged in the same request.
The inputs passed to the prompt template.
Identifies where the model was called from.
Any additional metadata to record.
Whether the request/response payloads will be stored on Humanloop.
ID of the source datapoint if this is a log derived from a datapoint in a dataset.
A unique string to reference the datapoint. Allows you to log nested datapoints with your internal system IDs by passing the same reference ID as parent_id in a subsequent log request.
Unique ID of an experiment trial to associate to the log.
messages: ChatMessageWithToolCall[]
The messages passed to the provider chat endpoint.
Generated output from your model for the provided inputs. Can be None if logging an error, or if logging a parent datapoint with the intention to populate it later.
Unique ID of a config to associate to the log.
config: ConfigProperty
The environment name used to create the log.
feedback: FeedbackLabelsProperty
User defined timestamp for when the log was created.
Error message if the log is an error.
Duration of the logged event in seconds.
output_message: ChatMessageWithToolCall
The message returned by the provider.
Number of tokens in the prompt used to generate the output.
Number of tokens in the output generated by the model.
Cost in dollars associated to the tokens in the prompt.
Cost in dollars associated to the tokens in the output.
Raw request sent to provider.
Raw response received from the provider.
/logs
POST
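The session_reference_id and parent_reference_id fields described above let you build nested logs using your own system IDs. A sketch of the two payload shapes — all values here are illustrative only, and the parent must be logged in an earlier request than the child:

```typescript
// Sketch of the nested-logging pattern: the parent datapoint is logged
// first with a session_reference_id and its own reference_id; the child
// is logged afterwards pointing at that reference.
const sessionRef = "my-system-session-42";
const parentLog = {
  project: "my project",
  session_reference_id: sessionRef,
  reference_id: "my-system-parent-7",
  inputs: { question: "What is Humanloop?" },
  save: true,
};
const childLog = {
  project: "my project",
  session_reference_id: sessionRef,
  // Must refer to a previously-logged datapoint, not one in the same request.
  parent_reference_id: parentLog.reference_id,
  output: "An illustrative answer.",
  save: true,
};
```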
Update a logged datapoint in your Humanloop project.
const updateResponse = await humanloop.logs.update({
id: "id_example",
});
String ID of logged datapoint to update. Starts with data_.
Generated output from your model for the provided inputs.
Error message if the log is an error.
Duration of the logged event in seconds.
/logs/{id}
PATCH
Update a logged datapoint by its reference ID.
The reference_id query parameter must be provided, and refers to the reference_id of a previously-logged datapoint.
const updateByRefResponse = await humanloop.logs.updateByRef({
referenceId: "referenceId_example",
});
A unique string to reference the datapoint. Identifies the logged datapoint created with the same reference_id.
Generated output from your model for the provided inputs.
Error message if the log is an error.
Duration of the logged event in seconds.
/logs
PATCH
Deserialize a model config from a .prompt file format.
const deserializeResponse = await humanloop.modelConfigs.deserialize({
config: "config_example",
});
/model-configs/deserialize
POST
Export a model config to a .prompt file by ID.
const exportResponse = await humanloop.modelConfigs.export({
id: "id_example",
});
String ID of the model config. Starts with config_.
/model-configs/{id}/export
POST
Get a specific model config by ID.
const getResponse = await humanloop.modelConfigs.get({
id: "id_example",
});
String ID of the model config. Starts with config_.
/model-configs/{id}
GET
Register a model config to a project and optionally add it to an experiment.
If the project name provided does not exist, a new project will be created automatically.
If an experiment name is provided, the specified experiment must already exist. Otherwise, an error will be raised.
If the model config is the first to be associated to the project, it will be set as the active model config.
const registerResponse = await humanloop.modelConfigs.register({
provider: "openai",
model: "model_example",
max_tokens: -1,
temperature: 1,
top_p: 1,
presence_penalty: 0,
frequency_penalty: 0,
endpoint: "complete",
});
The model instance used. E.g. text-davinci-002.
A description of the model config.
A friendly display name for the model config. If not provided, a name will be generated.
provider: ModelProviders
The company providing the underlying model service.
The maximum number of tokens to generate. Provide max_tokens=-1 to dynamically calculate the maximum number of tokens to generate given the length of the prompt.
What sampling temperature to use when making a generation. Higher values mean the model will be more creative.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass.
stop: StopSequenceSProperty
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the generation so far.
Number between -2.0 and 2.0. Positive values penalize new tokens based on how frequently they appear in the generation so far.
Other parameter values to be passed to the provider call.
If specified, model will make a best effort to sample deterministically, but it is not guaranteed.
response_format: ResponseFormat
The format of the response. Only type json_object is currently supported for chat.
Unique project name. If it does not exist, a new project will be created.
Unique project ID
If specified, the model config will be added to this experiment. Experiments are used for A/B testing and optimizing hyperparameters.
Prompt template that will take your specified inputs to form your final request to the provider model. NB: Input variables within the prompt template should be specified with syntax: {{INPUT_NAME}}.
chat_template: ChatMessageWithToolCall[]
Messages prepended to the list of messages sent to the provider. These messages will take your specified inputs to form your final request to the provider model. NB: Input variables within the prompt template should be specified with syntax: {{INPUT_NAME}}.
endpoint: ModelEndpoints
Which of the providers model endpoints to use. For example Complete or Edit.
Make tools available to OpenAI's chat model as functions.
/model-configs
POST
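The {{INPUT_NAME}} template syntax noted above can be illustrated with a small substitution helper. This is our own sketch of the documented behaviour, not the SDK's implementation (substitution happens server-side):

```typescript
// Replace {{INPUT_NAME}} placeholders with the corresponding input values,
// leaving unknown placeholders untouched.
function fillTemplate(template: string, inputs: Record<string, string>): string {
  return template.replace(/{{\s*(\w+)\s*}}/g, (match, name) =>
    name in inputs ? inputs[name] : match
  );
}

const template = "Write a {{tone}} reply to: {{message}}";
const prompt = fillTemplate(template, { tone: "friendly", message: "Hi!" });
// prompt === "Write a friendly reply to: Hi!"
```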
Serialize a model config to a .prompt file format.
const serializeResponse = await humanloop.modelConfigs.serialize({
provider: "openai",
model: "model_example",
max_tokens: -1,
temperature: 1,
top_p: 1,
presence_penalty: 0,
frequency_penalty: 0,
endpoint: "complete",
});
A description of the model config.
A friendly display name for the model config. If not provided, a name will be generated.
provider: ModelProviders
The company providing the underlying model service.
The model instance used. E.g. text-davinci-002.
The maximum number of tokens to generate. Provide max_tokens=-1 to dynamically calculate the maximum number of tokens to generate given the length of the prompt.
What sampling temperature to use when making a generation. Higher values mean the model will be more creative.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass.
stop: StopSequenceSProperty
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the generation so far.
Number between -2.0 and 2.0. Positive values penalize new tokens based on how frequently they appear in the generation so far.
Other parameter values to be passed to the provider call.
If specified, model will make a best effort to sample deterministically, but it is not guaranteed.
response_format: ResponseFormat
The format of the response. Only type json_object is currently supported for chat.
endpoint: ModelEndpoints
The provider model endpoint used.
chat_template: ChatMessageWithToolCall[]
Messages prepended to the list of messages sent to the provider. These messages will take your specified inputs to form your final request to the provider model. Input variables within the template should be specified with syntax: {{INPUT_NAME}}.
Make tools available to OpenAI's chat model as functions.
Prompt template that will take your specified inputs to form your final request to the model. Input variables within the prompt template should be specified with syntax: {{INPUT_NAME}}.
/model-configs/serialize
POST
Create a new project.
const createResponse = await humanloop.projects.create({
name: "name_example",
});
Unique project name.
feedback_types: FeedbackTypeRequest[]
Feedback types to be created.
ID of directory to assign project to. Starts with dir_. If not provided, the project will be created in the root directory.
/projects
POST
Create Feedback Type
const createFeedbackTypeResponse = await humanloop.projects.createFeedbackType({
id: "id_example",
type: "type_example",
_class: "select",
});
The type of feedback to update.
String ID of project. Starts with pr_.
values: FeedbackLabelRequest[]
The feedback values to be available. This field should only be populated when updating a 'select' or 'multi_select' feedback class.
class: FeedbackClass
The data type associated to this feedback type; whether it is a 'text'/'select'/'multi_select'. This is optional when updating the default feedback types (i.e. when type is 'rating', 'action' or 'issue').
/projects/{id}/feedback-types
POST
Remove the project's active config, if set.
This has no effect if the project does not have an active model config set.
const deactivateConfigResponse = await humanloop.projects.deactivateConfig({
id: "id_example",
});
String ID of project. Starts with pr_.
Name for the environment. E.g. 'production'. If not provided, will delete the active config for the default environment.
/projects/{id}/active-config
DELETE
Remove the project's active experiment, if set.
This has no effect if the project does not have an active experiment set.
const deactivateExperimentResponse =
await humanloop.projects.deactivateExperiment({
id: "id_example",
});
String ID of project. Starts with pr_.
Name for the environment. E.g. 'production'. If not provided, will return the experiment for the default environment.
/projects/{id}/active-experiment
DELETE
Delete a specific file.
const deleteResponse = await humanloop.projects.delete({
id: "id_example",
});
String ID of project. Starts with pr_.
/projects/{id}
DELETE
Remove the version deployed to the environment.
This has no effect if the project does not have an active version set.
const deleteDeployedConfigResponse =
await humanloop.projects.deleteDeployedConfig({
projectId: "projectId_example",
environmentId: "environmentId_example",
});
/projects/{project_id}/deployed-config/{environment_id}
DELETE
Deploy a model config to an environment.
If the environment already has a model config deployed, it will be replaced.
const deployConfigResponse = await humanloop.projects.deployConfig({
projectId: "projectId_example",
});
Model config unique identifier generated by Humanloop.
String ID of experiment. Starts with exp_.
environments: EnvironmentRequest[]
List of environments to associate with the model config.
EnvironmentProjectConfigResponse
/projects/{project_id}/deploy-config
PATCH
Export all logged datapoints associated to your project.
Results are paginated and sorted by created_at in descending order.
const exportResponse = await humanloop.projects.export({
id: "id_example",
page: 0,
size: 10,
});
String ID of project. Starts with pr_.
Page offset for pagination.
Page size for pagination. Number of logs to export.
/projects/{id}/export
POST
Get a specific project.
const getResponse = await humanloop.projects.get({
id: "id_example",
});
String ID of project. Starts with pr_.
/projects/{id}
GET
Retrieves a config to use to execute your model.
A config will be selected based on the project's active config/experiment settings.
const getActiveConfigResponse = await humanloop.projects.getActiveConfig({
id: "id_example",
});
String ID of project. Starts with pr_.
Name for the environment. E.g. 'production'. If not provided, will return the active config for the default environment.
/projects/{id}/active-config
GET
Get a paginated list of files.
const listResponse = await humanloop.projects.list({
page: 0,
size: 10,
sortBy: "created_at",
order: "asc",
});
Page offset for pagination.
Page size for pagination. Number of projects to fetch.
Case-insensitive filter for project name.
Case-insensitive filter for users in the project. This filter matches against both email address and name of users.
sortBy: ProjectSortBy
Field to sort projects by.
order: SortOrder
Direction to sort by.
/projects
GET
Get an array of versions associated to your file.
const listConfigsResponse = await humanloop.projects.listConfigs({
id: "id_example",
});
String ID of project. Starts with pr_.
/projects/{id}/configs
GET
Get an array of environments with the deployed configs associated to your project.
const listDeployedConfigsResponse =
await humanloop.projects.listDeployedConfigs({
id: "id_example",
});
String ID of project. Starts with pr_.
EnvironmentProjectConfigResponse
/projects/{id}/deployed-configs
GET
Update a specific project.
Set the project's active model config/experiment by passing either active_experiment_id or active_model_config_id. These will be set for the Default environment unless a list of environments is also passed in, specifically detailing which environments to assign the active config or experiment.
Set the feedback labels to be treated as positive user feedback used in calculating top-level project metrics by passing a list of labels in positive_labels.
const updateResponse = await humanloop.projects.update({
id: "id_example",
});
String ID of project. Starts with pr_.
The new unique project name. Caution: if you are using the project name as the unique identifier in your API calls, changing the name will break those calls.
ID for an experiment to set as the project's active deployment. Starts with 'exp_'. At most one of 'active_experiment_id' and 'active_model_config_id' can be set.
ID for a config to set as the project's active deployment. Starts with 'config_'. At most one of 'active_experiment_id' and 'active_config_id' can be set.
positive_labels: PositiveLabel[]
The full list of labels to treat as positive user feedback.
ID of directory to assign project to. Starts with dir_.
/projects/{id}
PATCH
Update feedback types.
Allows enabling the available feedback types and setting status of feedback types/categorical values.
This behaves like an upsert; any feedback categorical values that do not already exist in the project will be created.
const updateFeedbackTypesResponse =
await humanloop.projects.updateFeedbackTypes({
id: "id_example",
requestBody: [
{
type: "type_example",
_class: "select",
},
],
});
String ID of project. Starts with pr_.
requestBody: FeedbackTypeRequest[]
/projects/{id}/feedback-types
PATCH
Create a new session.
Returns a session ID that can be used to log datapoints to the session.
const createResponse = await humanloop.sessions.create();
/sessions
POST
Get a session by ID.
const getResponse = await humanloop.sessions.get({
id: "id_example",
});
String ID of session to return. Starts with sesh_.
/sessions/{id}
GET
Get a page of sessions.
const listResponse = await humanloop.sessions.list({
projectId: "projectId_example",
page: 1,
size: 10,
});
String ID of project to return sessions for. Sessions that contain any datapoints associated to this project will be returned. Starts with pr_.
Page to fetch. Starts from 1.
Number of sessions to retrieve.
/sessions
GET
This TypeScript package is automatically generated by Konfig.