Refactor Kaiaulu Architecture #283

Merged: 1 commit, Mar 3, 2024

1 change: 1 addition & 0 deletions NEWS.md
@@ -3,6 +3,7 @@ __kaiaulu 0.0.0.9700 (in development)__

### NEW FEATURES

* Kaiaulu architecture has been refactored. Instead of a parser, download, and network module structure, Kaiaulu now uses a combination of data type and tool structure. Accordingly, the functions of download.R, parser.R, and network.R are now separated into git.R, jira.R, etc. When only a small part of a tool's functionality is required, functions are grouped by the data type they are associated with, for example, src.R. Kaiaulu API documentation has been updated accordingly. Function signatures and behavior remain the same: the only modification is the new placement of functions into files. For further rationale and changes, see the issue. [#241](https://github.com/sailuh/kaiaulu/issues/241)
* Temporal bipartite projections are now weighted. The temporal projection can be parameterized by `weight_scheme_cum_temporal()` or `weight_scheme_pairwise_cum_temporal()` when all time lag edges are used; the existing weight schemes can also be used with a single lag. The all-lag weight schemes reproduce the behavior described in Codeface's paper. See the issue for details. [#229](https://github.com/sailuh/kaiaulu/issues/229)
* The functions `make_jira_issue()` and `make_jira_issue_tracker()` have been added, alongside examples and unit tests for `parse_jira()`. [#228](https://github.com/sailuh/kaiaulu/issues/228)
* We can now generate fake mailing lists with `make_mbox_reply` and `make_mbox_mailing_list` for unit testing and tool comparison. [#238](https://github.com/sailuh/kaiaulu/issues/238)
712 changes: 712 additions & 0 deletions R/bugzilla.R

Large diffs are not rendered by default.

451 changes: 0 additions & 451 deletions R/download.R

This file was deleted.

136 changes: 136 additions & 0 deletions R/dv8.R
@@ -4,6 +4,142 @@
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, You can obtain one at https://mozilla.org/MPL/2.0/.

#' Transform parsed dependencies into a structural dsm.json file.
#'
#' Converts table of dependencies from \code{\link{parse_dependencies}} into an *-sdsm.json.
#' In the sdsm.json, the Variables are all files/methods or any variables under analysis
#' (rows/columns in dependency matrix) and the Cells (matrix cell) contain all the relations of
#' variable (src & dest) pairs.
#'
#' @param project_dependencies A parsed depends project by \code{\link{parse_dependencies}}.
#' @param sdsmj_path the path to save the structural dsm (*-sdsm.json).
#' @param is_sorted whether to sort the variables (filenames) in the sdsm.json file (optional).
#' @export
#' @family edgelists
#' @family dv8
#' @seealso \code{\link{parse_dependencies}} to get a table of parsed dependencies needed as input into \code{\link{transform_dependencies_to_sdsmj}},
#' \code{\link{transform_gitlog_to_hdsmj}} to perform a similar transformation into a *-dsm.json using a gitlog,
#' \code{\link{transform_temporal_gitlog_to_adsmj}} to perform a similar transformation into a *-dsm.json using a temporal gitlog,
#' \code{\link{graph_to_dsmj}} to generate a *-dsm.json file.
transform_dependencies_to_sdsmj <- function(project_dependencies, sdsmj_path, is_sorted=FALSE){
  # Make copy of table to do changes
  project_depends <- copy(project_dependencies)

  # Convert table to long form.
  # id.vars is passed as a named argument (`=`); using `<-` here would also
  # create a stray `id.vars` variable in the function's environment.
  project_depends[["edgelist"]] <- melt(project_depends[["edgelist"]], id.vars = c("src_filepath","dest_filepath"), variable.name = "label")

  setnames(x=project_depends[["nodes"]], old = c("filepath"), new = c("name"))

  setnames(x=project_depends[["edgelist"]], old = c("src_filepath","dest_filepath", "value"),
           new = c("from","to", "weight"))

  # Put the weight column in front of the label column
  setcolorder(project_depends[["edgelist"]], c("from", "to", "weight", "label"))

  # This is a directed graph, so no duplication of edges
  graph_to_dsmj(project_depends, sdsmj_path, dsmj_name="sdsm", is_directed=TRUE, is_sorted)
}
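
The reshaping steps in `transform_dependencies_to_sdsmj` can be tried in isolation. The following is a minimal sketch on a made-up table, assuming the `data.table` package is installed; the file paths and the `Call` and `Import` columns are hypothetical, and the real edgelist from `parse_dependencies` may contain other dependency types.

```r
library(data.table)

# Hypothetical wide edgelist: one column per dependency type (assumed names).
edgelist <- data.table(
  src_filepath  = c("R/a.R", "R/b.R"),
  dest_filepath = c("R/b.R", "R/c.R"),
  Call          = c(3L, 0L),
  Import        = c(1L, 2L)
)

# Wide -> long: one row per (src, dest, dependency type).
long <- melt(edgelist,
             id.vars = c("src_filepath", "dest_filepath"),
             variable.name = "label")

# Rename to the from/to/weight convention and move weight before label.
setnames(long,
         old = c("src_filepath", "dest_filepath", "value"),
         new = c("from", "to", "weight"))
setcolorder(long, c("from", "to", "weight", "label"))

print(long)
```

Each original row yields one long-form row per dependency column, so the two-row toy table becomes four rows with columns `from`, `to`, `weight`, `label`.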

#' Transform parsed git repo into a history dsm.json file.
#'
#' Converts a gitlog table into an *-hdsm.json.
#' In the hdsm.json, the Variables are all files/methods or any variables under analysis
#' (rows/columns in dependency matrix) and the Cells (matrix cell) contain all the relations of
#' variable (src & dest) pairs. The Co-change is the number of times the src & dest were committed together.
#' Note that the co-change between a file and its renamed variant will not be considered
#' using this function, so those cells won't appear in the final *-hdsm.json.
#'
#' @param project_git A parsed git project by \code{\link{parse_gitlog}}.
#' @param hdsmj_path the path to save the history dsm (*-hdsm.json).
#' @param is_sorted whether to sort the variables (filenames) in the hdsm.json file (optional).
#' @export
#' @family edgelists
#' @family dv8
#' @seealso \code{\link{parse_gitlog}} to get a table of a parsed git project needed as input into \code{\link{transform_gitlog_to_hdsmj}},
#' \code{\link{transform_temporal_gitlog_to_adsmj}} to perform a similar transformation into a *-dsm.json using a temporal gitlog,
#' \code{\link{transform_dependencies_to_sdsmj}} to perform a similar transformation into a *-dsm.json using dependencies from Depends,
#' \code{\link{graph_to_dsmj}} to generate a *-dsm.json file.
transform_gitlog_to_hdsmj <- function(project_git, hdsmj_path, is_sorted=FALSE){
  # Call preliminary functions to get graph and cochange for the files
  git_bipartite <- transform_gitlog_to_bipartite_network(project_git, mode ="commit-file")
  cochange_table <- bipartite_graph_projection(git_bipartite, mode = FALSE,
                                               weight_scheme_function = weight_scheme_count_deleted_nodes)

  # Add label column with Cochange value
  cochange_table[["edgelist"]][["label"]] <- "Cochange"

  # This is an undirected graph, so there is duplication of edges
  graph_to_dsmj(cochange_table, hdsmj_path, dsmj_name="hdsm", is_directed=FALSE, is_sorted)
}
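
The co-change definition in the documentation above (the number of times two files were committed together) can be illustrated with a small base R sketch. This is not Kaiaulu's implementation: it does not call `bipartite_graph_projection` or `weight_scheme_count_deleted_nodes`, and the commit and file names are made up for illustration.

```r
# Hypothetical commit-file pairs (one row per file touched by a commit).
commit_file <- data.frame(
  commit = c("c1", "c1", "c2", "c2", "c2", "c3"),
  file   = c("a.R", "b.R", "a.R", "b.R", "c.R", "c.R"),
  stringsAsFactors = FALSE
)

# Group files by commit, then enumerate every unordered file pair per commit.
files_per_commit <- split(commit_file$file, commit_file$commit)
pairs <- lapply(files_per_commit, function(files) {
  files <- sort(unique(files))
  if (length(files) < 2) return(NULL)  # single-file commits add no pair
  t(combn(files, 2))
})
pairs <- do.call(rbind, pairs)

# Count how many commits each pair shares: this is the co-change weight.
pair_table <- data.frame(from = pairs[, 1], to = pairs[, 2], weight = 1L,
                         stringsAsFactors = FALSE)
cochange <- aggregate(weight ~ from + to, data = pair_table, FUN = sum)
cochange$label <- "Cochange"

print(cochange)
```

In this toy history `a.R` and `b.R` share two commits, so their weight is 2, while the pairs involving `c.R` share one commit each.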

#' Transform parsed git repo into an author dsm.json file.
#'
#' Converts a temporal gitlog table into an *-adsm.json.
#' In the adsm.json, the Variables are all the authors under analysis
#' (rows/columns in dependency matrix) and the Cells (matrix cell) contain all the relations of
#' variable (src & dest) pairs. The Collaborate value is the number of times the src author and dest author changed the same file.
#'
#' @param project_git A parsed git project by \code{\link{parse_gitlog}}.
#' @param adsmj_path the path to save the author dsm (*-adsm.json).
#' @param is_sorted whether to sort the variables (authors) in the adsm.json file (optional).
#' @export
#' @family edgelists
#' @family dv8
#' @seealso \code{\link{parse_gitlog}} to get a table of a parsed git project needed as input into \code{\link{transform_temporal_gitlog_to_adsmj}},
#' \code{\link{transform_gitlog_to_hdsmj}} to perform a similar transformation into a *-dsm.json using a gitlog,
#' \code{\link{transform_dependencies_to_sdsmj}} to perform a similar transformation into a *-dsm.json using dependencies from Depends,
#' \code{\link{graph_to_dsmj}} to generate a *-dsm.json file.
transform_temporal_gitlog_to_adsmj <- function(project_git, adsmj_path, is_sorted=FALSE){
  # Call preliminary functions to get graph and collaborators for the files
  author_table <- transform_gitlog_to_temporal_network(project_git, mode=c("author"))

  # Add label column with Collaborate value
  author_table[["edgelist"]][["label"]] <- "Collaborate"

  # This is a directed graph, so no duplication of edges
  graph_to_dsmj(author_table, adsmj_path, dsmj_name="adsm", is_directed=TRUE, is_sorted)
}

#' Transform parsed git repo into an edgelist
#'
#' @param project_git A parsed git project by \code{\link{parse_gitlog}}.
#' @param mode The network of interest: author-file, committer-file, commit-file, author-committer
#' @export
#' @family edgelists
transform_gitlog_to_bipartite_network <- function(project_git, mode = c("author-file","committer-file","commit-file","author-committer")){
  author_name_email <- author_datetimetz <- commit_hash <- committer_name_email <- committer_datetimetz <- lines_added <- lines_removed <- NULL # due to NSE notes in R CMD check
  # Check user did not specify a mode that does not exist
  mode <- match.arg(mode)
  # Select and rename relevant columns. Key = commit_hash.
  project_git <- project_git[,.(author=author_name_email,
                                author_date=author_datetimetz,
                                commit_hash=commit_hash,
                                committer=committer_name_email,
                                committer_date = committer_datetimetz,
                                file = file_pathname,
                                added = lines_added,
                                removed = lines_removed)]
  if(mode == "author-file"){
    git_graph <- model_directed_graph(project_git[,.(from=author,to=file)],
                                      is_bipartite=TRUE,
                                      color=c("black","#f4dbb5"))
  }else if(mode == "committer-file"){
    git_graph <- model_directed_graph(project_git[,.(from=committer,to=file)],
                                      is_bipartite=TRUE,
                                      color=c("#bed7be","#f4dbb5"))
  }else if(mode == "commit-file"){
    git_graph <- model_directed_graph(project_git[,.(from=commit_hash,to=file)],
                                      is_bipartite=TRUE,
                                      color=c("#afe569","#f4dbb5"))
  }else if(mode == "author-committer"){
    git_graph <- model_directed_graph(project_git[,.(from=author,to=committer)],
                                      is_bipartite=TRUE,
                                      color=c("black","#bed7be"))
  }
  return(git_graph)

}
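
The mode handling in `transform_gitlog_to_bipartite_network` relies on base R's `match.arg()`. The sketch below shows that mechanism alone; `pick_edge_columns` is a hypothetical helper written for illustration and is not part of Kaiaulu.

```r
# Hypothetical helper: maps a mode to the columns used as edge endpoints.
pick_edge_columns <- function(mode = c("author-file", "committer-file",
                                       "commit-file", "author-committer")) {
  # With no argument supplied, match.arg() returns the first choice;
  # a value outside the choices raises an error.
  mode <- match.arg(mode)
  switch(mode,
         "author-file"      = c(from = "author",      to = "file"),
         "committer-file"   = c(from = "committer",   to = "file"),
         "commit-file"      = c(from = "commit_hash", to = "file"),
         "author-committer" = c(from = "author",      to = "committer"))
}

pick_edge_columns()               # defaults to the "author-file" mapping
pick_edge_columns("commit-file")  # explicit mode

# An unknown mode is rejected before any graph is built.
result <- tryCatch(pick_edge_columns("author-entity"),
                   error = function(e) "rejected")
print(result)
```

Validating the mode up front means a mistyped mode fails immediately instead of falling through every `if` branch and leaving `git_graph` undefined.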

#' Transforms a gitlog table to a historical DSM JSON file.
#'
#' Converts a gitlog table into an *-hdsm.json.