Designing a tool to help generate user personas from survey data.
Data scientists at the Hartree Centre North East Hub worked with Persona Design to develop a proof-of-concept artificial intelligence tool that generates user personas from the survey data supplied to it.
Persona Design is a startup developing a dynamic persona generation tool, which allows organisations to develop interactive user personas based on existing market research and customer interactions. In this project we tackled the challenge of processing customer survey data containing mixed data types to inform persona generation. The goal was to automatically identify market segments within the data set and to generate a persona representing each segment, grounded in the survey data.
The project began with the development of prompts for an Azure-hosted GPT-4o large language model (LLM) to return user personas in the Persona Design team’s existing persona response format. To incorporate user survey data, the team first applied dimensionality reduction techniques and the K-means clustering method, allowing clusters within the numeric portions of the data to be identified, visualised and compared to existing market segmentation tools. The workflow was then extended to handle other data types, including free-text survey responses, using the K-Prototypes algorithm to form clusters in a mixed numerical and categorical feature space.
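To illustrate how K-Prototypes can cluster mixed data, the short Python sketch below shows the dissimilarity measure commonly used by the algorithm: squared Euclidean distance on the numeric answers plus a weighted count of mismatches on the categorical answers. It is not the project's code. The function names, the example values and the weight `gamma` are assumptions chosen purely for illustration.

```python
from typing import List, Sequence, Tuple


def mixed_dissimilarity(
    num_a: Sequence[float],
    cat_a: Sequence[str],
    num_b: Sequence[float],
    cat_b: Sequence[str],
    gamma: float = 1.0,
) -> float:
    """K-Prototypes style distance between two survey responses.

    Numeric answers contribute squared Euclidean distance; categorical
    answers contribute one unit per mismatch, scaled by gamma
    (an illustrative weight, not a value taken from the project).
    """
    numeric_part = sum((a - b) ** 2 for a, b in zip(num_a, num_b))
    categorical_part = sum(1 for a, b in zip(cat_a, cat_b) if a != b)
    return numeric_part + gamma * categorical_part


def nearest_prototype(
    num_row: Sequence[float],
    cat_row: Sequence[str],
    prototypes: List[Tuple[Sequence[float], Sequence[str]]],
    gamma: float = 1.0,
) -> int:
    """Return the index of the cluster prototype closest to one response."""
    distances = [
        mixed_dissimilarity(num_row, cat_row, p_num, p_cat, gamma)
        for p_num, p_cat in prototypes
    ]
    return min(range(len(distances)), key=distances.__getitem__)


# Hypothetical prototypes: (scaled numeric answers, categorical answers)
prototypes = [([0.0, 0.0], ["online"]), ([5.0, 5.0], ["in-store"])]
cluster = nearest_prototype([4.0, 5.0], ["in-store"], prototypes, gamma=0.5)
```

A larger `gamma` gives the categorical questions more influence on cluster membership, which is one way to explore the effect of weighting different parts of a survey.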
Visualising the survey responses for the generated clusters, the team performed clustering experiments to tune the algorithm parameters and investigate the impact of weighting different questions within the survey data. The team passed representative data from each cluster to the latest generative AI models to create detailed persona descriptions, ensuring that the generated personas were grounded in the responses of the real customers surveyed. The final personas were passed to potential end-users, who provided a positive evaluation of the outputs.
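One simple way to pass representative data from each cluster to a language model is sketched below in Python. It is a minimal illustration under stated assumptions, not the team's implementation: it picks the respondents closest to each cluster's numeric centroid and formats their answers into a prompt. The field names, the selection rule and the prompt wording are all hypothetical.

```python
from typing import Dict, List, Sequence


def representative_rows(
    rows: Sequence[dict],
    labels: Sequence[int],
    numeric_keys: Sequence[str],
    per_cluster: int = 2,
) -> Dict[int, List[dict]]:
    """Pick the responses nearest each cluster's numeric centroid."""
    clusters: Dict[int, List[dict]] = {}
    for row, label in zip(rows, labels):
        clusters.setdefault(label, []).append(row)

    selected: Dict[int, List[dict]] = {}
    for label, members in clusters.items():
        # Mean of each numeric answer across the cluster members
        centroid = {
            key: sum(m[key] for m in members) / len(members)
            for key in numeric_keys
        }
        ranked = sorted(
            members,
            key=lambda m, c=centroid: sum(
                (m[key] - c[key]) ** 2 for key in numeric_keys
            ),
        )
        selected[label] = ranked[:per_cluster]
    return selected


def build_persona_prompt(label: int, examples: Sequence[dict]) -> str:
    """Format representative responses into a grounding prompt (wording is illustrative)."""
    lines = [
        f"Write a user persona for market segment {label}.",
        "Base every detail only on these survey responses:",
    ]
    for i, row in enumerate(examples, start=1):
        answers = "; ".join(f"{key}: {value}" for key, value in row.items())
        lines.append(f"{i}. {answers}")
    return "\n".join(lines)


# Hypothetical survey rows and cluster labels
rows = [
    {"age": 20, "comment": "I shop on my phone"},
    {"age": 30, "comment": "I compare prices first"},
    {"age": 26, "comment": "I want fast delivery"},
    {"age": 60, "comment": "I prefer to visit the shop"},
]
labels = [0, 0, 0, 1]
picked = representative_rows(rows, labels, numeric_keys=["age"], per_cluster=1)
prompt = build_persona_prompt(0, picked[0])
```

Restricting the prompt to real responses from one cluster is what keeps the generated persona tied to the surveyed customers rather than to the model's general knowledge.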
The code deliverable and the web app showcasing the outputs will enable Persona Design to demonstrate the ability to generate accurate user personas from customer survey data. The project developed workflows to handle mixed data types, including numerical, categorical and free-text responses, making the process adaptable to future data sets from other potential users in a variety of markets. As a result of the project, Persona Design are working towards integrating the proof-of-concept clustering code as a feature within the persona generation pipeline of their existing trial platform, ready for further testing with end-users.
"The team at the Hartree Centre North East Hub were very good at helping us to realise the intricacies of data manipulation and the array of tools on offer. We are much more aware of the data science process in solving these problems. We are going away with not only a better understanding of how we can integrate what the team at the Hartree Centre North East Hub have done, but also where to start when approaching a data problem like this in the future."
- Dan Foster-Smith, Persona Design
This work was completed as part of one of our collaborative data projects. The projects are up to 12 weeks in duration and give you access to a wide range of expertise across our team of data scientists and data engineers. We will work alongside your team to scope your data science or engineering project, build a prototype solution, and explore options to deploy it within your organisation. You can learn more about them on our webpage here.
If you would like to learn more about the Hartree Centre North East Hub or our collaborative data projects, please get in touch with us at: hello@hartreenortheast.uk