Applying Machine Learning Techniques in Software Engineering
Most large-scale software development projects follow an incremental build model in which developers iteratively design, test and implement features. In an Agile environment, requirements are broken down into features, and each feature is assigned a measure of complexity such as story points. Developers are then assigned these development tasks based on their complexity. A software project can hence be considered a group of software features of varied complexity, implemented by various developers.
Project Managers do not have a good way to compare these features in terms of complexity; the comparison usually rests on the Project Manager’s intuition. A more tangible way to gauge the effort a developer puts into implementing a feature is the complexity of its code change set.
This is an attempt to use Machine Learning to categorize software feature implementations based on their code change sets. The goal is to extract metrics from the code change set of each implemented feature and to group the feature implementations using a clustering algorithm. This clustering gives the Project Manager a tool to compare in-progress features with historical ones and to adjust timelines, risk categorization and testing strategies accordingly.
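The clustering idea described above can be illustrated with a small, self-contained sketch. It is not the project's actual implementation: the feature IDs, the three metrics (lines changed, files touched, added branch points) and their values are hypothetical, and K-Means is written out by hand (Lloyd's algorithm) rather than taken from a toolkit, so the example runs with the Python standard library alone.

```python
import math
import random

# Hypothetical metrics per feature change set (all values are made up):
# (lines changed, files touched, added branch points)
change_sets = {
    "FEAT-1": (12, 1, 0),
    "FEAT-2": (20, 2, 1),
    "FEAT-3": (15, 1, 1),
    "FEAT-4": (480, 14, 9),
    "FEAT-5": (510, 16, 11),
    "FEAT-6": (450, 12, 8),
}


def min_max_scale(rows):
    """Rescale every column to [0, 1] so no single metric dominates the distance."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def k_means(points, k, iterations=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Start from k randomly chosen points (seeded for repeatability).
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


names = list(change_sets)
scaled = min_max_scale([change_sets[n] for n in names])
labels, centroids = k_means(scaled, k=2)
clusters = dict(zip(names, labels))
print(clusters)
```

With this toy data the three small change sets fall into one cluster and the three large ones into the other. Cluster numbers themselves carry no meaning; only the grouping does. In practice the choice of metrics and of k would need to be validated against real project history.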
Complexity comparison is crucial in software engineering for tasks such as determining developer productivity and improving project planning.
Broad Academic Area of Work: Machine Learning and Software Engineering
Key words: Clustering, K-Means, Code Change Sets, Code Analysis
Docker image for machine learning enthusiasts
MachineLearningUltimate is a Docker image with a bunch of popular machine learning toolkits preinstalled. You can quickly spin up an instance of this image on Linux or on Windows (using Docker Toolbox).
The goal of MachineLearningUltimate is to have all popular machine learning and statistical computing toolkits ready for use.
Other projects on GitHub
BeautifulColors - A .NET Core library for working with colors.
Bencode2Json - A .NET Core library for converting Bencoded dictionaries to JSON documents.