Frequently Asked Questions

PromptKey is a comprehensive platform for evaluating large language models (LLMs) across any provider. It allows user-centric evaluation with custom grading parameters tailored to specific workloads and offers powerful tools for managing prompts, datasets, and model performance analysis.

PromptKey is designed for teams, businesses, and researchers who need to evaluate and optimize LLM prompts and responses. Users can invite subject matter experts to grade responses and leverage LLMs as judges to enhance evaluation consistency.

PromptKey supports user grading with customizable grading parameters. It also enables expert reviews and uses LLM-based evaluation informed by human feedback to ensure high-quality and objective assessments.

  1. Identify datasets, models, and presets: Users select the datasets, models, and any existing presets for the evaluation. Custom grading parameters specific to the workload are configured.
  2. Understand presets: A preset is a model configuration that includes parameters like max_tokens, temperature, top_p, and top_k. These settings control the behavior and output quality of the model. Each model can have up to three presets, offering flexibility to test different configurations (see the example sketch after this list).
  3. Kick off the evaluation: Once everything is set, the user starts the evaluation process.
  4. Generate and store LLM responses: The system generates LLM responses based on the provided prompts and datasets, and the results are securely stored.
  5. User and expert grading: Users can review and grade the responses. Subject matter experts (SMEs) can also be invited to provide their evaluations.
  6. Full evaluation with LLM as a judge: Once human grading is complete, users can trigger a full evaluation where the LLM acts as a judge, using the human feedback to evaluate other records and ensure consistency.
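
For illustration, a pair of presets might look like the sketch below. The field names and values are assumptions based on the parameters listed above, not PromptKey's exact schema:

```python
# Illustrative only: hypothetical preset definitions using the parameters
# described above (max_tokens, temperature, top_p, top_k). PromptKey's actual
# preset schema may differ.
preset_conservative = {
    "name": "conservative",  # hypothetical label for this preset
    "max_tokens": 512,       # cap on the length of the generated response
    "temperature": 0.2,      # lower values -> more deterministic output
    "top_p": 0.9,            # nucleus sampling cutoff
    "top_k": 40,             # restrict sampling to the 40 most likely tokens
}

preset_creative = {
    "name": "creative",
    "max_tokens": 1024,
    "temperature": 0.9,
    "top_p": 0.95,
    "top_k": 100,
}

# Up to three such presets can be attached to a model, letting the same prompt
# and dataset be evaluated under different generation settings.
```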

Yes, evaluations can be revisited, and additional grading or evaluations can be performed as needed to refine results and improve accuracy.

Users can sign up using their Google or GitHub accounts for quick and secure access.

Projects are the core organizational structure in PromptKey. Prompts and datasets are part of projects, making it easy to manage and evaluate different workloads in an organized way.

Yes, PromptKey allows you to store prompts with variables and create multiple versions, making it easy to iterate and refine over time.

Datasets for the variables can be uploaded in JSON or CSV format, and you can also create and manage different dataset versions.
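
As a rough illustration, a prompt with variables and a matching dataset might look like the sketch below. The variable syntax and column names are assumptions for this example, not a documented PromptKey format:

```python
import csv
import json

# Hypothetical prompt template with three variables; PromptKey's variable
# syntax may differ from the {curly-brace} style used here.
prompt_template = (
    "Summarize the following {document_type} for a {audience} audience:\n\n{text}"
)

# Each dataset row supplies values for the prompt's variables.
rows = [
    {"document_type": "support ticket", "audience": "engineering", "text": "..."},
    {"document_type": "contract clause", "audience": "legal", "text": "..."},
]

# JSON upload
with open("dataset_v1.json", "w") as f:
    json.dump(rows, f, indent=2)

# CSV upload (same rows, one column per variable)
with open("dataset_v1.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["document_type", "audience", "text"])
    writer.writeheader()
    writer.writerows(rows)
```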

Datasets are generally created under a project. When creating a prompt, users can either select an existing dataset from any project or add a new one. Any newly added dataset automatically appears in the datasets tab for that project.

Yes, all datasets from all projects can be viewed at the top level or organization level. This makes it easy to manage and access datasets without being restricted to a single project view.

Absolutely! PromptKey supports dataset versioning, allowing you to maintain multiple iterations of your datasets and revert or compare versions as needed.

Users can securely store their API keys for different LLM providers. These keys are used exclusively for generating LLM responses within the platform, and robust security measures are in place to protect them.

Today, only one API key can be stored per provider. In the future, we plan to support multiple keys per provider.

Yes, you can store and manage API keys for multiple LLM providers. The system will automatically use the appropriate key based on your evaluation setup.

All added API keys are viewable in the configuration tab at the top (organization) level.

Your API keys are encrypted before being stored in our database. We use industry-standard encryption protocols to ensure your keys cannot be accessed in plain text form.

We use strong encryption algorithms to protect your API keys. The encryption keys themselves are securely stored in a dedicated Key Management Service (KMS) vault, separated from the encrypted data.

Your API keys are only decrypted at the moment they’re needed to execute an evaluation job. After the job completes, the decrypted keys are immediately removed from memory.
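
The behavior described above matches a standard envelope-encryption pattern. The sketch below shows the general idea with a local stand-in for the KMS; it is illustrative only and not PromptKey's actual implementation:

```python
from cryptography.fernet import Fernet

# Stand-in for a KMS vault: in a real deployment the data key would be
# wrapped/unwrapped by a managed service, never stored alongside the data.
class FakeKMS:
    def __init__(self):
        self._master_key = Fernet.generate_key()

    def wrap(self, data_key: bytes) -> bytes:
        return Fernet(self._master_key).encrypt(data_key)

    def unwrap(self, wrapped_key: bytes) -> bytes:
        return Fernet(self._master_key).decrypt(wrapped_key)

kms = FakeKMS()

# Encrypt the provider API key with a fresh data key, then wrap the data key.
data_key = Fernet.generate_key()
encrypted_api_key = Fernet(data_key).encrypt(b"sk-example-provider-key")
wrapped_data_key = kms.wrap(data_key)
# Only encrypted_api_key and wrapped_data_key are persisted in the database.

# At evaluation time: unwrap the data key, decrypt, use, then discard.
plaintext_key = Fernet(kms.unwrap(wrapped_data_key)).decrypt(encrypted_api_key)
# ... run the evaluation job with plaintext_key ...
del plaintext_key  # drop the decrypted key from memory once the job completes
```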

No one on our team has direct access to your plain text API keys. The system architecture is designed to maintain key confidentiality throughout the entire process.

Yes, you can remove your API keys at any time through the configuration tab. Once deleted, the encrypted keys are permanently removed from our database.

In the unlikely event of a security incident, attackers would only have access to encrypted API keys, not the actual keys themselves, as the decryption keys are stored separately in the KMS vault.

No. Your API keys are strictly used only for the evaluation jobs you initiate. We never use your keys for any other purpose.

No, all provider API keys are treated with the same high level of security, regardless of which LLM provider they belong to.

Pricing is job-specific and depends on the number of tokens processed. This ensures you only pay for the resources you use.

Yes, PromptKey provides an approximate cost based on the dataset size, the expected LLM response size, and other available details such as the grading parameters and the number of SME responses. The final cost may vary slightly based on actual LLM token usage and job duration.
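
As a rough sketch of how such an estimate can be computed, consider the function below. The per-token rates and the formula are illustrative assumptions, not PromptKey's actual pricing:

```python
def estimate_job_cost(
    num_rows: int,
    avg_prompt_tokens: int,
    avg_response_tokens: int,
    price_per_1k_input: float = 0.0005,   # assumed illustrative rate, USD
    price_per_1k_output: float = 0.0015,  # assumed illustrative rate, USD
) -> float:
    """Rough pre-run estimate; actual cost depends on real token counts."""
    input_cost = num_rows * avg_prompt_tokens / 1000 * price_per_1k_input
    output_cost = num_rows * avg_response_tokens / 1000 * price_per_1k_output
    return input_cost + output_cost

# e.g. 500 dataset rows, ~400 prompt tokens and ~250 response tokens per row
print(f"${estimate_job_cost(500, 400, 250):.2f}")
```
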
PromptKey charges the credit card stored on file for each evaluation job.

PromptKey provides comprehensive dashboards that offer insights into costs, latency, content moderation, token usage, and overall model performance.

Yes, PromptKey’s dashboards track historical data, helping you monitor improvements and spot trends in model performance and costs.

Yes, you can invite subject matter experts to participate in the grading process. Their feedback is incorporated into the evaluation, enhancing the quality and relevance of model assessments.

At the top or organization level (click the navigation at the top left), you can open the configuration tab where all users are visible.

  1. Invite experts: After LLM responses are generated and stored, users can invite subject matter experts (SMEs) via email to participate in grading.
  2. Expert access: Experts receive an email invitation and, upon logging in, can see the projects they’ve been invited to.
  3. Select a project: Experts choose the project they want to grade and access the list of LLM-generated responses.
  4. Unbiased grading: Experts see the request made to the LLM and the corresponding response, along with the custom grading parameters defined by the user. However, they will not see which model the response is from, ensuring unbiased ratings.
  5. Grade responses: Experts provide their evaluation for each record and can easily move on to the next one.

Yes, an SME can be invited to participate in the grading process for multiple projects. They will see a list of all the projects they’ve been invited to when they log in and can select the project they wish to grade.

No, SMEs will not be informed of the model that generated the response. This ensures that their evaluations remain unbiased and focus solely on the quality of the responses.

PromptKey’s LLM-as-a-judge feature uses human feedback as a foundation to assess other records, creating a more scalable and consistent evaluation process.
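
One common way to implement this pattern is to show the judge model a few human-graded examples alongside the grading parameters before asking it to score an ungraded record. The sketch below illustrates that prompt construction under those assumptions; it is not PromptKey's internal implementation:

```python
def build_judge_prompt(grading_parameters, human_graded_examples, record):
    """Assemble a judging prompt that anchors the LLM on human feedback.

    grading_parameters: list of criterion names (e.g. ["accuracy", "tone"])
    human_graded_examples: records already scored by users or SMEs
    record: the ungraded request/response pair to be judged
    """
    lines = ["You are grading LLM responses on: " + ", ".join(grading_parameters), ""]
    lines.append("Here are examples graded by human reviewers:")
    for ex in human_graded_examples:
        lines.append(f"Request: {ex['request']}")
        lines.append(f"Response: {ex['response']}")
        lines.append(f"Human scores: {ex['scores']}")
        lines.append("")
    lines.append("Now grade the following record using the same criteria and scale:")
    lines.append(f"Request: {record['request']}")
    lines.append(f"Response: {record['response']}")
    return "\n".join(lines)
```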

Our evaluation system allows you to create custom grading parameters tailored to your specific workload. You can define success criteria, scoring rubrics, and evaluation metrics that matter most to your use case.

Yes, you can create custom grading parameters specific to your workload. This ensures that evaluations align with your actual business needs and use cases rather than generic metrics.
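
For illustration, a set of custom grading parameters for a customer-support summarization workload might look like this. The structure is a hypothetical example, not PromptKey's schema:

```python
# Hypothetical grading parameters for a customer-support summarization
# workload; field names and scales are illustrative assumptions.
grading_parameters = [
    {
        "name": "factual_accuracy",
        "description": "The summary contains no claims absent from the ticket",
        "scale": (1, 5),
    },
    {
        "name": "completeness",
        "description": "All customer-reported issues are captured",
        "scale": (1, 5),
    },
    {
        "name": "tone",
        "description": "Neutral, professional wording",
        "scale": (1, 5),
    },
]
```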

PromptKey supports multiple evaluation approaches:

  • Internal team evaluations
  • Subject matter expert invitations
  • LLM-based automated evaluations that incorporate human feedback
  • Collaborative grading across teams