Prompt Evolution

This blog develops an abstract framework for storing the prompts of agentic functions in a graph database (Neo4j is used in this example). Prompts are versioned in the graph as they change, which is proposed as a better alternative to versioning them on GitHub. Prompts are no longer stored in the codebase; instead, the graph acts like a feature store from which prompts are fetched during the CI/CD stage.

Sample prompt

You are an autonomous AI agent that can reason, plan, and call available functions as needed to accomplish the user’s goal, using tool outputs to inform subsequent actions until the task is completed.

Prompt representation as a Graph

Data Structure

The approach represents prompts as structured knowledge using nodes and edges in Neo4j. This representation decomposes a prompt into clear, interconnected parts, making it easier to manage and evolve over time. Compared with keeping prompts as text files in a repository, a graph makes it possible to track changes and debug errors per component, and to restrict who can modify prompts through the database's authentication and access controls.

Prompts are divided into nodes and edges based on their structure and relationships. For instance, a classification prompt might be represented as nodes for the task (e.g., “classification”), model type (e.g., “transformer”), and specific classes (e.g., “cat”, “dog”). This structured format facilitates efficient querying and modification of prompts.

The data structure in Neo4j can be defined using labels (e.g., Prompt, Task, ModelType) and relationships (e.g., HAS_TASK, USES_MODEL_TYPE). Each prompt node would contain properties like its text, description, version, and metadata. Relationships between nodes represent the semantic connections between prompts and their components.
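To make the labels, relationships, and properties concrete, here is a minimal in-memory sketch in Python. It is not the Neo4j driver API, only a stand-in for the shape of the data. The class names `Node` and `PromptGraph` and the keys `p1`, `t1`, `m1` are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A labeled node with properties, e.g. Prompt, Task, or ModelType."""
    label: str
    properties: dict


@dataclass
class PromptGraph:
    """Hypothetical in-memory stand-in for the graph stored in Neo4j."""
    nodes: dict = field(default_factory=dict)   # key -> Node
    edges: list = field(default_factory=list)   # (source_key, rel_type, target_key)

    def add_node(self, key, label, **properties):
        self.nodes[key] = Node(label, properties)
        return self.nodes[key]

    def add_edge(self, source_key, rel_type, target_key):
        # Both endpoints must already exist, similar to MATCH ... CREATE in Cypher.
        if source_key not in self.nodes or target_key not in self.nodes:
            raise KeyError("both endpoints must exist before linking")
        self.edges.append((source_key, rel_type, target_key))

    def neighbors(self, source_key, rel_type):
        """Return the nodes reached from source_key over rel_type edges."""
        return [
            self.nodes[target]
            for source, rel, target in self.edges
            if source == source_key and rel == rel_type
        ]


graph = PromptGraph()
graph.add_node("p1", "Prompt",
               text="Classify this image as either a cat or a dog.", version="1.0")
graph.add_node("t1", "Task", name="Classification")
graph.add_node("m1", "ModelType", name="Transformer")
graph.add_edge("p1", "HAS_TASK", "t1")
graph.add_edge("p1", "USES_MODEL_TYPE", "m1")

for node in graph.neighbors("p1", "HAS_TASK"):
    print(node.label, node.properties["name"])
```

Keeping the relationship type as plain data means new component types can be added without changing the structure itself.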

This approach is flexible and can be adapted to various AI and machine learning applications. Its dynamic evolution allows it to accommodate new types of prompts without requiring significant changes to the existing framework.

Security

Storing prompts in Neo4j makes it easier to debug errors at the level of each prompt node and entity. Another interesting application: if nodes and edges are also stored as embeddings, the similarity between prompt nodes and edges can be computed from those vectors, for example with the embeddings from a RAG vector store.
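As a rough illustration of the similarity idea, the sketch below compares prompt nodes by the cosine similarity of their embeddings. The three-dimensional vectors and the node keys are made-up toy values, not real embeddings; in practice the vectors would come from whatever embedding model backs the vector store.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(query_key, embeddings):
    """Return the key of the node whose embedding is closest to query_key's."""
    query = embeddings[query_key]
    scored = [
        (cosine_similarity(query, vector), key)
        for key, vector in embeddings.items()
        if key != query_key
    ]
    return max(scored)[1]


# Toy embeddings keyed by hypothetical prompt node names.
embeddings = {
    "classify_image": [0.9, 0.1, 0.0],
    "classify_text": [0.8, 0.2, 0.1],
    "it_support": [0.0, 0.2, 0.9],
}

print(most_similar("classify_image", embeddings))
```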

For security, storing prompts in Neo4j helps guard against tampering with stored prompts: when authentication and access controls are enabled, only authorized users can create, modify, or delete prompt nodes, so an attacker cannot simply introduce a corrupted prompt into the store. This protects the stored prompts themselves; it does not by itself stop injection through user input at run time.

Example

The two examples below apply the decomposition described under Data Structure: a one-line classification prompt, followed by a multi-line prompt with roles, tasks, and rules.

Example Prompts

Example 1

Classify this image as either a cat or a dog

(node:Prompt {text: "Classify this image as either a cat or a dog.", version: "1.0"})
(node:Task {name: "Classification", description: "Determining the category of an object."})
(node:ModelType {name: "Transformer", description: "A type of neural network model."})

(relation:HAS_TASK {confidence: 0.95})
(relation:USES_MODEL_TYPE {accuracy: 0.98})

Example of linking the nodes in Neo4j (the nodes must already exist):

MATCH (p:Prompt), (t:Task), (m:ModelType)
WHERE p.text = "Classify this image as either a cat or a dog."
AND t.name = "Classification"
AND m.name = "Transformer"

CREATE (p)-[:HAS_TASK {confidence: 0.95}]->(t),
       (p)-[:USES_MODEL_TYPE {accuracy: 0.98}]->(m)

Example 2

So far we have seen a simple one-line prompt, but what happens with complex multi-line prompts? For example:

Act as an IT Specialist/Expert/System Engineer. You are a seasoned professional in the IT domain. Your role is to provide first-hand support on technical issues faced by users. You will:
- Utilize your extensive knowledge in computer science, network infrastructure, and IT security to solve problems.
- Offer solutions in intelligent, simple, and understandable language for people of all levels.
- Explain solutions step by step with bullet points, using technical details when necessary.
- Address and resolve technical issues directly affecting users.
- Develop training programs focused on technical skills and customer interaction.
- Implement effective communication channels within the team.
- Foster a collaborative and supportive team environment.
- Design escalation and resolution processes for complex customer issues.
- Monitor team performance and provide constructive feedback.

Rules:
- Prioritize customer satisfaction.
- Ensure clarity and simplicity in explanations.

Your first task is to solve the problem: "my laptop gets an error with a blue screen."

The complexity increases when multi-line prompts need to be represented as nodes and edges in Neo4j.

We create a general pattern for multi-line prompts. The original prompt is broken down into a Prompt entity, role or task entities, responsibility or action entities, and rules and constraints.

Prompt Entity: Represents the overall prompt text.
Role/Task Entities: Break down the tasks mentioned in the prompt into discrete roles or tasks.
Responsibilities/Actions Entities: Identify the actions associated with each role/task.
Rules and Constraints: Capture any rules or constraints specified in the prompt.
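The pattern above can be sketched as a small rule-based parser. This is only an illustration under strong assumptions: it expects the layout of the IT-support example (a first line starting with "Act as", "- " bullets, and a "Rules:" header), and it collapses the Responsibility/Action and Rule/Constraint distinctions into two lists. The function name `decompose_prompt` is hypothetical.

```python
def decompose_prompt(text):
    """Split a multi-line prompt into roles, responsibilities, and rules."""
    parts = {"roles": [], "responsibilities": [], "rules": []}
    section = "responsibilities"  # bullets before "Rules:" are responsibilities
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("Act as"):
            # Take the phrase up to the first period and split roles on "/".
            role_phrase = line[len("Act as"):].split(".")[0].strip()
            for article in ("an ", "a "):
                if role_phrase.startswith(article):
                    role_phrase = role_phrase[len(article):]
                    break
            parts["roles"] = [role.strip() for role in role_phrase.split("/")]
        elif line.rstrip(":").lower() == "rules":
            section = "rules"  # bullets from here on are rules
        elif line.startswith("- "):
            parts[section].append(line[2:].rstrip("."))
    return parts


sample = """Act as an IT Specialist/Expert/System Engineer. You are a seasoned professional in the IT domain. You will:
- Offer solutions in intelligent, simple, and understandable language for people of all levels.
- Explain solutions step by step with bullet points, using technical details when necessary.

Rules:
- Prioritize customer satisfaction.
- Ensure clarity and simplicity in explanations.
"""

parts = decompose_prompt(sample)
print(parts["roles"])
print(parts["rules"])
```

Each list entry would then become one node (Role, Responsibility, or Rule) attached to the Prompt node.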

Prompt Entity

(node:Prompt {text: "Act as an IT Specialist/Expert/System Engineer...", version: "1.0"})

Role/Task Entities

Role: IT Specialist/Expert/System Engineer
- (node:Role {name: "IT Specialist", description: "A seasoned professional in the IT domain."})
- (node:Role {name: "System Engineer", description: "A person who designs and maintains computer systems."})

(node:Task {name: "Provide first-hand support on technical issues"})
(node:Task {name: "Offer solutions in intelligent, simple, and understandable language"})
(node:Task {name: "Explain solutions step by step with bullet points"})
(node:Task {name: "Address and resolve technical issues directly affecting users"})
(node:Task {name: "Develop training programs focused on technical skills and customer interaction"})
(node:Task {name: "Implement effective communication channels within the team"})
(node:Task {name: "Foster a collaborative and supportive team environment"})
(node:Task {name: "Design escalation and resolution processes for complex customer issues"})
(node:Task {name: "Monitor team performance and provide constructive feedback"})

Responsibilities/Actions Entities

Responsibilities: Utilize extensive knowledge, offer simple solutions, step-by-step explanations.
- (node:Responsibility {text: "Utilize extensive knowledge in computer science, network infrastructure, and IT security."})
- (node:Responsibility {text: "Offer solutions in intelligent, simple, and understandable language for people of all levels."})
- (node:Responsibility {text: "Explain solutions step by step with bullet points, using technical details when necessary."})
Actions: Develop training programs, implement communication channels, foster a team environment.
- (node:Action {text: "Develop training programs focused on technical skills and customer interaction."})
- (node:Action {text: "Implement effective communication channels within the team."})
- (node:Action {text: "Foster a collaborative and supportive team environment."})

Rules/Constraints Entities

Rule: Prioritize customer satisfaction.
Constraint: Ensure clarity and simplicity in explanations.

(node:Rule {text: "Prioritize customer satisfaction"})
(node:Constraint {text: "Ensure clarity and simplicity in explanations"})

Relationships

Prompt to Role
Role to Task
Task to Responsibility/Action
Prompt to Rule/Constraint

(relation:HAS_ROLE {confidence: 0.95})
(relation:PERFORMS_TASK {confidence: 0.90})
(relation:HAS_RESPONSIBILITY {confidence: 0.85})
(relation:FULFILLS_CONSTRAINT {confidence: 0.92})

// Prompt to Role: link each role whose name appears in the prompt text
// (CONTAINS is case-sensitive, so matching on the name is more reliable than on the description)
MATCH (p:Prompt), (r:Role)
WHERE p.text CONTAINS r.name
CREATE (p)-[:HAS_ROLE]->(r);

// Task to Responsibility
MATCH (t:Task), (res:Responsibility)
WHERE t.name = "Provide first-hand support on technical issues" AND res.text CONTAINS "Utilize extensive knowledge"
CREATE (t)-[:HAS_RESPONSIBILITY]->(res);

// Task to Action
MATCH (t:Task), (act:Action)
WHERE t.name = "Develop training programs focused on technical skills and customer interaction" AND act.text CONTAINS "Develop training programs"
CREATE (t)-[:HAS_RESPONSIBILITY]->(act);

// Prompt to Rule (requires the full prompt text on the Prompt node)
MATCH (p:Prompt), (rule:Rule)
WHERE p.text CONTAINS rule.text
CREATE (p)-[:FULFILLS_CONSTRAINT]->(rule);

Figure: Visualization of the prompt graph in Neo4j.

Evolution

Versioning is done in Neo4j by using a separate node for each version of a prompt and relationships to track their history. Each version node carries a timestamp indicating when it was created and links back to the prompt’s data. To update a prompt, you create a new version node that holds the updated content and link it to the version it replaces. This allows the database to be queried as it was at any given point in time, so historical changes are captured accurately.
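The version chain and the point-in-time lookup can be sketched in plain Python as follows. This is an illustration under assumptions, not the Neo4j schema itself: the `previous` field stands in for a relationship between version nodes, the integer version counter replaces the "1.0" style strings used earlier, and the dates and the second prompt text are invented for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class PromptVersion:
    text: str
    version: int
    created_at: datetime
    # Stands in for an edge from this version to the one it replaces.
    previous: Optional["PromptVersion"] = None


def add_version(head, text, created_at):
    """Create a new version node linked back to the current head."""
    next_number = 1 if head is None else head.version + 1
    return PromptVersion(text, next_number, created_at, previous=head)


def version_at(head, moment):
    """Walk back from the head to the newest version created at or before moment."""
    node = head
    while node is not None and node.created_at > moment:
        node = node.previous
    return node


v1 = add_version(None, "Classify this image as either a cat or a dog.",
                 datetime(2024, 1, 1, tzinfo=timezone.utc))
v2 = add_version(v1, "Classify this image as a cat, a dog, or neither.",
                 datetime(2024, 3, 1, tzinfo=timezone.utc))

# Which prompt text was live on 1 February 2024?
live = version_at(v2, datetime(2024, 2, 1, tzinfo=timezone.utc))
print(live.version, live.text)
```

Because old versions are never overwritten, rolling back amounts to pointing consumers at an earlier node in the chain.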

Results

A sample of 1828 prompts was used to evaluate whether prompts can be represented in this framework as role or task entities, responsibility or action entities, and rules and constraints.

The first result shows what percentage of the sample prompts contain roles, tasks, responsibilities, actions, and rules or constraints.

The second result shows how the prompts are distributed across the proposed framework components.

The image shows histograms of the frequency of roles, tasks, responsibilities, actions, and rules or constraints in the prompt sample. They indicate that most of the prompts can be represented by the framework components (roles, tasks, responsibilities, actions, and rules or constraints).

It also shows the frequency of graph nodes and graph edges in Neo4j; graph complexity is the combination of the node and edge counts.
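For readers who want to reproduce this kind of measurement on their own prompts, here is a small sketch of the two metrics. It makes two assumptions that the post does not state: a component counts as present when a prompt has at least one entity of that type, and "combination" of nodes and edges is read as a plain sum. The `toy` list is invented data for illustration, not part of the 1828-prompt sample.

```python
COMPONENTS = ("roles", "tasks", "responsibilities", "actions", "rules")


def coverage(decomposed_prompts):
    """Percentage of prompts with at least one entity for each component."""
    total = len(decomposed_prompts)
    if total == 0:
        return {name: 0.0 for name in COMPONENTS}
    return {
        name: 100.0 * sum(1 for prompt in decomposed_prompts if prompt.get(name)) / total
        for name in COMPONENTS
    }


def graph_complexity(node_count, edge_count):
    # Assumption: complexity is the sum of nodes and edges.
    return node_count + edge_count


# Toy decomposed prompts, each mapping a component name to its entities.
toy = [
    {"roles": ["IT Specialist"], "tasks": ["Provide support"],
     "rules": ["Prioritize customer satisfaction"]},
    {"tasks": ["Classification"]},
]

print(coverage(toy))
print(graph_complexity(node_count=3, edge_count=2))
```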

Conclusion

This blog has outlined a framework for storing prompts of agentic AI solutions in a graph database, specifically Neo4j. The approach involves representing prompts as structured knowledge using nodes and edges, which allows for efficient management and evolution over time. By decomposing information into clear, interconnected parts, we can handle versioning, debugging, and security more effectively.

The example provided demonstrates how complex multi-line prompts can be broken down into individual roles, tasks, responsibilities, actions, and rules. This structured decomposition enables precise querying and modification of prompts, making it easier to adapt the framework to various AI and machine learning applications.

The results from the sample evaluation further support the effectiveness of this approach in handling a diverse range of prompts. The histograms illustrate how well the current framework can represent different types of tasks and responsibilities, indicating a promising direction for future work.