In this story, we are going to explore a Cobol to Python converter written with Microsoft’s AutoGen framework. AutoGen is a Python-based framework that lets you orchestrate multiple types of agents using different conversation patterns.
About AutoGen
AutoGen has the following types of basic agents:
ConversableAgent: the base class for all other AutoGen agents. It provides the core functionality to send and receive messages and to initiate or continue a conversation.
UserProxyAgent: a proxy agent for humans. By default it solicits human input as the agent’s reply at each interaction turn, and it can also execute code and call functions. We use this type of agent in our converter to start and terminate the conversation or to execute code.
AssistantAgent: the agent which interacts with the LLM and typically generates text. It neither executes code nor interacts with the user. We use this agent to generate Python code, unit tests and code reviews.
You can also create a GroupChat, which allows you to manage a group of AssistantAgents that interact with each other.
Finally, AutoGen also allows you to create your own custom agents, but we did not use this functionality for this converter.
The conversation patterns that AutoGen supports include, for example:
bi-directional chat between a user proxy and an assistant agent
group chat involving a user proxy and a group of assistant agents
multi-group chat involving multiple groups of agents which interact with each other
For more details, please check the AutoGen documentation: https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat
What is the goal of the Cobol Converter?
The goal of this command line tool is to read Cobol files and convert them into Python code with documentation and unit tests. Its secondary goal is to convert the Python code into REST based applications using FastAPI.
High-level workflow
How does the Cobol Converter work?
The Cobol converter uses two types of tools:
AutoGen agents
Python coding tools: Black formatter and Pylint, a static code analyzer
So we have combined AutoGen agents, which are backed by an LLM (gpt-4-1106-preview), with conventional tools used for checking and formatting code.
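To give a feel for the conventional tooling side, here is a minimal sketch of how Black and Pylint could be invoked from Python through their command line interfaces. The helper names (black_command, pylint_command, run_tool, format_and_lint) are hypothetical and not taken from the converter's code base. The sketch only assumes that both tools are installed in the active environment.

```python
import subprocess
import sys
from pathlib import Path


def black_command(python_file: Path) -> list[str]:
    """Build the command that formats a file in place with Black."""
    return [sys.executable, "-m", "black", str(python_file)]


def pylint_command(python_file: Path) -> list[str]:
    """Build the command that runs Pylint's static analysis on a file."""
    return [sys.executable, "-m", "pylint", str(python_file)]


def run_tool(command: list[str]) -> str:
    """Run a tool and return everything it printed (stdout and stderr)."""
    # check=False: Pylint exits with a non-zero code whenever it reports findings.
    completed = subprocess.run(command, capture_output=True, text=True, check=False)
    return completed.stdout + completed.stderr


def format_and_lint(python_file: Path) -> Path:
    """Format the file, lint it and store the lint report next to it."""
    run_tool(black_command(python_file))
    report = run_tool(pylint_command(python_file))
    # Assumed naming scheme, e.g. write_student_2.py -> write_student_2.py_lint.txt
    report_file = python_file.with_name(python_file.name + "_lint.txt")
    report_file.write_text(report, encoding="utf-8")
    return report_file
```

Calling format_and_lint(Path("write_student_2.py")) would reformat the file and leave a lint report beside it.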
There are two AutoGen agent ensembles (teams) at work in this application.
Cobol conversion team
REST conversion team
Agents Teams
Cobol Conversion Team
The Cobol conversion team converts Cobol to Python with documentation and unit tests. It has three agents apart from the user proxy agent which receives the user input:
The Cobol conversion agent — responsible for the Cobol conversion using an LLM
The Unit test agent — used to generate the unit tests from Python code using an LLM
The code reviewer — used to review the converted Python code and also the unit tests
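A team like this could be wired together roughly as follows. This is a sketch under assumptions, not the converter's actual setup: it assumes the AutoGen 0.2-style Python API (the autogen package), uses shortened system messages, and picks an arbitrary max_round value. The termination check on the word TERMINATE and the OPENAI_API_KEY variable name are also assumptions made for illustration.

```python
import os


def is_termination_msg(message: dict) -> bool:
    """Stop the chat when a reply ends with TERMINATE (assumed convention)."""
    content = message.get("content") or ""
    return content.rstrip().endswith("TERMINATE")


def build_llm_config(model: str = "gpt-4-1106-preview") -> dict:
    """LLM settings shared by the assistants; the API key is read from the environment."""
    api_key = os.environ.get("OPENAI_API_KEY", "")
    return {"config_list": [{"model": model, "api_key": api_key}]}


def build_conversion_team(llm_config: dict):
    """Create a user proxy plus three assistants managed by a group chat."""
    # Imported lazily so the helpers above work without AutoGen installed.
    from autogen import AssistantAgent, GroupChat, GroupChatManager, UserProxyAgent

    user_proxy = UserProxyAgent(
        name="user_proxy",
        human_input_mode="NEVER",     # do not prompt a human at every turn
        code_execution_config=False,  # this sketch does not execute code
        is_termination_msg=is_termination_msg,
    )
    coder = AssistantAgent(
        name="python_coder",
        system_message="You convert Cobol code into Python code.",
        llm_config=llm_config,
    )
    tester = AssistantAgent(
        name="python_unit_tester",
        system_message="You create unit tests for the Python code in the conversation.",
        llm_config=llm_config,
    )
    critic = AssistantAgent(
        name="code_critic",
        system_message="You review the code. Reply with TERMINATE when it is OK.",
        llm_config=llm_config,
    )
    group_chat = GroupChat(
        agents=[user_proxy, coder, tester, critic], messages=[], max_round=12
    )
    manager = GroupChatManager(groupchat=group_chat, llm_config=llm_config)
    return user_proxy, manager
```

With the team built, user_proxy.initiate_chat(manager, message=cobol_source) would start the conversation, where cobol_source is the text of one Cobol file.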
REST Conversion Team
The REST conversion team takes the converted Python code as input and converts it into a REST interface. So if the application was a command line application of some sort, it becomes a REST-based application using FastAPI.
Full workflow of the Cobol Agent
The Cobol to Python converter workflow consists of a loop which processes each single Cobol file and uses the two agent ensembles and the traditional tools (Black and Pylint).
Here is an annotated version of the workflow:
Full conversion workflow
The Cobol converter workflow has these main stages:
The initial loop processes each Cobol file in a directory.
This is the task block with the Cobol Conversion Team. In this block of tasks the Cobol code is converted, unit tests are created and the code is reviewed. When the Cobol Conversion Team stops, the converter extracts all relevant blocks of Python code or text from the agents' replies.
In this block of tasks, the code review is written to disk. The Python code is formatted and written to disk. The code is then analysed with Pylint and the result of the analysis is written to disk. The unit tests are also executed and their output is saved in a file.
This is the task block with the REST conversion team. It has two agents which interact with each other: the REST code converter and the code reviewer. After they have generated the code, the Python code is extracted along with the code review.
In this block the code review is written to disk and the REST interface is actually executed in a process, to see if the code compiles and also runs. The process is then shut down, and the code is formatted and written to disk. Pylint is then used to analyse the code and this analysis is also written to disk.
The converter finishes after all Cobol files have been processed.
The Cobol Converter Output
For a small Cobol file like this one, you should get output like the following:
Conversion output example
The files are:
rest_critique_write_student_2.txt — the code review for the REST implementation
rest_write_student_2.py — The REST based implementation
rest_write_student_2.py_lint.txt — The result of the static code analysis for the REST implementation
test_write_student_2.py — The unit tests for the converted file
test_write_student_2.py_lint — The static code analysis report for the unit tests of the converted file
test_write_student_2.py_test_output.log — The execution log for test_write_student_2.py
write_student_2.py — The Cobol conversion file
write_student_2.py_lint.txt — The static code analysis for write_student_2.py
If you are interested in the converted files, please check this Google Drive link: https://drive.google.com/drive/u/2/folders/1F7dqo5F2_zDzD8GcLlQFj70ZLdox5SL9
Implementation
The whole code for this command line tool can be found in this repository: https://github.com/onepointconsulting/cobol-converter
Installation, Configuration, Running
The Cobol code converter is a Python 3.11 application which requires Conda to be installed.
The installation instructions can be found in the README of this project.
The configuration of the project relies on an .env file, similar to the .env_local file that you can find in this project.
The Cobol files are read from a directory which should be under the project root folder. This directory is referenced by the SOURCE_CODE_DIR environment variable.
The main entry point for the application is this file: https://github.com/onepointconsulting/cobol-converter/blob/main/cobol_converter/cobol_converter_main.py
This main entry point accepts three arguments that determine how the output files are written: overwrite (overwrites the output files), clear (clears the output files) and only_new (only writes out files that are not yet translated).
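One way to model these three modes is sketched below with argparse. How the real entry point exposes them (flags, positional arguments or something else) is not shown here, so the positional mode argument, the only_new default and the should_write helper are all assumptions made for illustration.

```python
import argparse

# The three modes named in the text.
WRITE_MODES = ("overwrite", "clear", "only_new")


def build_parser() -> argparse.ArgumentParser:
    """Create a parser with a single, optional positional write mode."""
    parser = argparse.ArgumentParser(description="Convert Cobol files to Python.")
    parser.add_argument(
        "mode",
        nargs="?",
        choices=WRITE_MODES,
        default="only_new",  # assumed default, chosen because it is the safest
        help="how existing output files are handled",
    )
    return parser


def should_write(mode: str, already_translated: bool) -> bool:
    """In only_new mode, skip files that already have output; otherwise write."""
    return not (mode == "only_new" and already_translated)
```

For example, build_parser().parse_args(["overwrite"]).mode yields "overwrite", and should_write("only_new", True) tells the loop to skip a file that was already translated.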
Prompts
We have separated the agent prompts from the code. The prompts for all agents and user proxies are in this TOML file: https://github.com/onepointconsulting/cobol-converter/blob/main/prompts.toml
Here are some prompts used by the Cobol Conversion Team:
[agents]
[agents.python_coder]
system_message = """You are a helpful AI assistant.
You convert Cobol code into Python code. Please do not provide unit tests. Provide instead a main method to run the application.
Also do not omit any code for brevity. We want to see the whole code."""
[agents.python_unit_tester]
system_message = """You are a helpful AI assistant.
You create unit tests based on the unit test library for Python code in the conversation.
Please copy the original Python code that you are testing to your response.
Please make sure to import the unit test library. Provide a main method to run the tests."""
[agents.code_critic]
system_message = """Critic. You are a helpful assistant highly skilled in evaluating the quality of a given code by providing a score from 1 (bad) - 10 (good) while providing clear rationale. YOU MUST CONSIDER CODING BEST PRACTICES for each evaluation. Specifically, you can carefully evaluate the code across the following dimensions
- bugs (bugs): are there bugs, logic errors, syntax error or typos? Are there any reasons why the code may fail to compile? How should it be fixed? If ANY bug exists, the bug score MUST be less than 5.
- Goal compliance (compliance): how well the Cobol code was converted?
- Data encoding (encoding): How good are the unit tests that you can find?
YOU MUST PROVIDE A SCORE for each of the above dimensions.
{bugs: 0, transformation: 0, compliance: 0, type: 0, encoding: 0, aesthetics: 0}
Do not suggest code.
Finally, based on the critique above, suggest a concrete list of actions that the coder should take to improve the code.
If Unit tests are available already and seem OK, reply with TERMINATE"""
Agent Teams Setup
The teams of agents are set up in these two files:
The Cobol conversion team is set up in this file: https://github.com/onepointconsulting/cobol-converter/blob/main/cobol_converter/service/agent_setup.py
The REST Conversion Team is set up in this file: https://github.com/onepointconsulting/cobol-converter/blob/main/cobol_converter/service/agent_rest_setup.py
Main Workflow Implementation
The main implementation of the workflow can be found in this file: https://github.com/onepointconsulting/cobol-converter/blob/main/cobol_converter/service/cobol_conversion_service.py
The link below points to the method which processes every Cobol file and performs the Cobol to Python conversion, Unit test generation, formatting, static code analysis and REST interface creation
For more details about the code please check the code repository.
Takeaways
The Cobol Converter is a first step that can convert Cobol to Python in a short period of time. Smaller, simpler programmes might be translated correctly and even executed (e.g. the write_student_2 example). However, more complicated programmes (e.g. the tic tac toe game) were not functionally equivalent when translated to Python, even though the output was syntactically correct: you could start the program, but the game was unplayable.
The LLM we used was gpt-4-1106-preview. We have not tried any other models, and some of them might be better at converting Cobol to Python.
Ways to improve the conversion
We can think of several ways to improve the conversion:
A fine-tuned LLM, specialized in converting your Cobol dialect. This would be extremely important to have, especially if you intend to do many conversions.
Human feedback built into the code generation process. The human feedback team should consist of at least one developer who understands Cobol and the target language well. No matter how good LLMs are these days, they might still generate code in unexpected ways or in ways that are not aligned with your business goals. It would be very valuable if human developer feedback could be fed into the code generation process.
If you opt for fine-tuning a model, you will be able to refine it over time by using the results of successful conversions as new fine-tuning data.