Creating specialized software test data generation tools on the fly with ChatGPT

Martin Nilsson
ChatGPT: Optimizing language models for dialogue

Creating a test data generation tool VS using ChatGPT to generate specific types of tests

When working with software testing I have from time to time coded test data generation tools of different kinds. Some tools have copied test data from one place and pasted to another, some have generated text strings of specific lengths and I like using my own implementation of a counter string generator to test input lengths (a thanks to James Bach for detailing it!).

It can take me an hour or two to create a tool that generates a specific type of test data but often this time is too long compared to the limited use I often have in a specific situation. With ChatGPT I’ve found that I can generate specific test data tools for specific situations faster than I can go and get a cup of coffee from the coffee machine. This tool is, very simplified, a very advanced chat bot that takes a prompt as input and provides and answer, but I can leave it to itself to provide a quick overview:

ChatGPT's definition of itself.

The creation of a first simple test data tool

When exploring the capabilities of ChatGPT I started with creation of a list of common security-test inputs with the following prompt:

Generate test data for a text field that tests for security holes, such as XSS or database attacks

The answer that came back from ChatGPT was the following:

Here are several examples of test data that can be used to test for security holes such as XSS and database attacks in a text field:
1) “”
2) “DROP TABLE users”
3) “‘; DELETE FROM users WHERE 1=1 — “
4) “SELECT * FROM users WHERE username=’admin’ — “
5) “alert(1)//”
6) “>”
7) “”>”
8) “<IMG SRC=javascript:alert(‘XSS’)>”
9) “”
10) “<”
Note: These test cases should not be used in a production environment, as they could cause harm to systems. They are intended for use in a controlled testing environment only.

This was perfect for my need but it required me to copy paste the text. So to avoid wearing out my ctrl, c and v-buttons I prompted ChatGPT to create a Python program:

Write a python program that allows me to choose which of the points above I want to have copied into the clipboard

I then realized I also wanted a UI:

Write a user interface to the program

In totalt I now had spent less then a couple of minutes to get a UI with test data that perfectly suited a temporary need:

When clicking on a button the text is copied into the clipboard, ready to be pasted into a text field. The program is 100% written by ChatGPT

Creating a Personas-generator

In testing it can be useful to use personas to look at the product from the perspective of an end-user. For example an end-user might not have the technological skills that the tester have or take for granted. This is how I created a personas generator based on ChatGPT:

First I asked ChatGPT to generate a personas using this prompt:

Provide one set of test data that contains:
A first and last name
Age
Sex
Hobbies
Account balance between 0 and 1000€
Technological skill level from novice to expert

I got the following answer:

John Doe, 37, Male, Tennis, Photography, 650€, Advanced

This looked a good enough answer even though ChatGPT seems to have a preference for the male name “John Doe”. Now I wanted an application that could generate personas on the fly. I entered the following prompt:

Create a python program with a button called “Generate Test Persona”, and a text box. When the user presses the button called “Generate Test Persona” the program shall send the following chat message to Chat GPT3: “Provide one set of test data that contains:
A first and last name
Age
Sex
Hobbies
Account balance between 0 and 1000€
Technological skill level from novice to expert”
The response shall be displayed in the text box.

The output was not perfect and I had to do some updates (the end-point, definition of what model to use and the maximum number of tokens allowed). But once that was done, I now had a tool that can generate random personas on demand. This took me less than a couple of minutes in total!

The code is easy to modify to suit my needs better. For example I made the text box larger to fit all the text. I also decided that the personas needed back stories to make it easier for the tester to put him or her-self in the shoes of the end user. I added to the prompt that I wanted a back story and here are some of the answers I got:

Name: John Smith
Age: 37
Sex: Male
Hobbies: Soccer, Reading, Cooking
Account Balance: 500€
Technological Skill Level: Intermediate
Back Story: John has been using technology since he was a child, but he’s never had a deep understanding of how it works. He’s been able to use computers and phones to do basic tasks like checking his emails, but he’s never been able to do more complicated things like coding. He’s always been interested in technology, but he’s never been able to take the time to learn more about it.
Name: Sophie Green
Age:28
Sex: Female
Hobbies: Traveling, Knitting
Account Balance: 800€
Technological Skill Level: Intermediate
Back Story: Sophie has always had an interest in technology and how it can make her life easier. As a result, she took several courses in computer science and became proficient in using various software and hardware. She’s always been eager to learn and expand her knowledge, but never saw herself as an expert. Instead, she considers herself an intermediate user who is comfortable with technology, but still has much to learn. Despite her busy schedule, she still finds time to knit, which is her favorite hobby, as well as traveling to new and exciting places.

The names I got were typical English or American names but the end users can come from different countries so I decided that I wanted an Italian name (to have a little bit of fun with the Italian team I am currently working with). It took a couple of seconds to add “An Italian first and last name” to the prompt. Oh, and I also wanted a button that takes the user to the House of Test-homepage so I asked for that as well. Now I had a nice result!:

A persona is generated by ChatGPT when the button is pressed

I did struggle a little bit with the randomness of the test data. The first test I did for the personas it randomly created personas that were either novice, intermediate or expert just as I wanted. After adding the back story it seemed to have to have a strong preference for intermediate, after generating ten personas all had that same skill level. I adjusted it by forcing ChatGPT to choose either novie or expert and it started to randomize between the two. There is probably a lot to learn about how to formulate the prompt to get better results but one must also make sure to not trust the randomness provided unless properly tested.

Thoughts about the technology

The technology behind ChatGPT is amazing but not perfect. The biggest upside for me so far has been to quickly get up to speed, I get something that I can work with instead of starting from scratch. But it is important to understand what you are asking for and what you are getting back. If you want a random number it can be important to understand exactly how random that number generation is. And as I experienced; the randomness was already low with the male name John Doe being preferred by ChatGPT based on my prompt.

I strongly believe that it will become increasingly important for every person to read books. And read lot’s of them. If people can’t reason about how correct the returning information is then they will be at risk of being controlled by the AI rather than the AI supporting them. If a person cannot question the output from the AI then that person is doomed to always follow the instructions from the AI since that will be the only information the person can base a decision on. A well informed person can instead take the output from the AI, match it against the persons own knowledge and use that as a starting point to faster navigate to a solution or conclusion.

What ChatGPT thought about this post when asking for feedback

I asked ChatGPT to provide feedback on my writing of this blog post and this is the feedback I got on the first draft:

Feedback on an earlier version of this blog post fro ChatGPT

Based on the feedback I added a little bit of information about ChatGPT at the beginning of this post and I added the paragraph with my thoughts about the technology.

Summary

The time investment to code a specific test data tool from scratch might not be worth it unless I know I am going to use it frequently. But with ChatGPT I can spend the first minutes in an exploratory test session generating the exact test data tool I need to complete the charter. I can potentially work faster and reach a higher test coverage during a test session if I have the right tools at my disposal. The step to connect the automatic data generation to an automatic check is small and it will be interesting to see how this will be used in the future. With this technology it will be important to understand exactly what data you are asking for, and what data (and its characteristics) you get back. If you lack this understanding you will be doomed to do as the AI says instead of using the AI to improve your work.

Test the Personas-generator

If you want to play around with the code that was generated for python you can find it below. What I have changed is the end-point, added “model”, added max_tokens amount and made the text box larger. The rest is untouched and generated by ChatGPT. You need to install Python and the libraries needed (tkinter and requests) and you need to exchange API_KEY for your own that can be generated on openai.com for your account: