This is a project created as part of the Software Development Practice (COMP0104) assignment 2 at University College London (UCL). Its role is to mine repositories (from the Apache Software Project) to see how much TDD is used in practice.
-
Ensure Python 3.10.x or higher is installed
-
Clone project Clone the repository locally, or download and extract the ZIP file.
-
Install prerequisite packages Run the following command from the project root:
pip install -r requirements.txt
To execute the TDD analysis, use the command-line interface:
python tdd_analysis.py [--date DATE] [--language LANGUAGE] [--languages LANGUAGES ...]
[--repository REPOSITORY] [--batch_size BATCH_SIZE] [--force-mine]
[--verbose]
-
--date DATE (optional)
: The date for the experiment in YYYY-MM-DD format. Defaults to a specific date intdd_analysis.py
. -
--language LANGUAGE (optional)
: Single programming language to analyse. Defaults toJava
.Note: If this argument is provided, the
'--languages'
argument will be ignored -
--languages LANGUAGES ... (optional)
: List of programming languages to analyse. Defaults to['Java', 'Python', 'Kotlin', 'C#', 'Rust']
.Note: If the
'--language'
argument is provided, this list is ignored. -
--repository --repo REPOSITORY (optional)
: URL for repository to analyse. By default, analyse all repos underresources/repositories
.Note: The
'--language'
argument must also be provided if'--repository'
is provided. -
--batch_size BATCH_SIZE (optional)
: Batch size for asynchronous repository retrieval using PyDriller. Defaults to 8. -
--force_mine (optional)
: Forcefully mine the repository/repositories, even if they have already been retrieved. Defaults to False. -
--verbose (optional)
: Enable verbose output for debugging or detailed logs.
To display a help message with detailed usage instructions, run:
python tdd_analysis.py --help
To execute the repo finder, use the command-line interface:
python find_repos.py [--github_token TOKEN] [--organisation ORGANISATION] [--language LANGUAGE] [--pagination PAGINATION] [--maximum MAXIMUM]
-
--github_token GITHUB_TOKEN
: Personal GitHub token. For information on how to create one, visit: https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens#creating-a-fine-grained-personal-access-token -
--language LANGUAGE
: Programming language to find repositories for. -
--organisation --org ORGANISATION (optional)
: Organisation to find repositories for. (default: Apache) -
--pagination PAGINATION (optional)
: Pagination for searching GitHub. Defaults to 100. -
--maximum --max MAXIMUM (optional)
: Maximum number of repositories to find.
To display a help message with detailed usage instructions, run:
python find_repos.py --help
To run the unit tests, simply call:
pytest
This diagram illustrates the entire workflow.