neighbor provides importable Go packages (e.g., builtin/*, sdk/*) and an accompanying
command-line interface for searching, cloning, and executing an arbitrary binary
against GitHub projects. Abstractions make these tasks easy and efficient for
projects obtained through arbitrary search and retrieval methods
(i.e., not limited to GitHub Search, repositories, or Git clone).
The motivation for neighbor is to give users (e.g., developers and researchers)
a way to search, efficiently clone, and evaluate projects without having to
"roll their own", so that they can focus on the task at hand. [TODO] Another motivation
for neighbor is to provide researchers with a standard, reproducible way of obtaining projects,
in order to guarantee fair comparisons of approaches rather than relying on "hand-picked"
projects that reinforce claims. For now, this can be accomplished by zipping the
project versions retrieved from a run of neighbor.
neighbor uses v3 of GitHub's REST API.
- Extensibility
  - Abstract interfaces for projects, search, and retrieval functions, which make it easy to add new "types" of projects (e.g., something other than GitHub repositories) and to use other search and retrieval methods in addition to GitHub search and Git clone, respectively.
- Abstracting GitHub API interaction (searching, sorting and cloning)
  - Transparent pagination
  - Transparent authentication
  - Transparent rate limit handling
- Doing the above efficiently by leveraging Go's concurrent capabilities
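To illustrate the last point, here is a minimal sketch of running one command across several project directories with a bounded pool of goroutines. It is not neighbor's actual implementation; the names `runAll` and `result` are hypothetical and exist only for this example.

```go
package main

import (
	"fmt"
	"os/exec"
	"sync"
)

// result pairs a project directory with the outcome of running the command in it.
// (Hypothetical type for this sketch; not part of neighbor's API.)
type result struct {
	dir    string
	output []byte
	err    error
}

// runAll executes name with args once per directory, using at most workers
// goroutines. Each command runs with the project directory as its working
// directory. Results are returned in the same order as dirs.
func runAll(dirs []string, workers int, name string, args ...string) []result {
	if workers < 1 {
		workers = 1
	}
	results := make([]result, len(dirs))
	jobs := make(chan int)
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				cmd := exec.Command(name, args...)
				cmd.Dir = dirs[i]
				out, err := cmd.CombinedOutput()
				// Each goroutine writes to a distinct index, so no lock is needed.
				results[i] = result{dir: dirs[i], output: out, err: err}
			}
		}()
	}
	for i := range dirs {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	dirs := []string{".", ".."}
	for _, r := range runAll(dirs, 2, "ls", "-al") {
		if r.err != nil {
			fmt.Printf("%s: %v\n", r.dir, r.err)
			continue
		}
		fmt.Printf("%s: %d bytes of output\n", r.dir, len(r.output))
	}
}
```

Bounding the number of workers keeps the number of simultaneously running processes predictable, which matters when the command is expensive.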
- Go 1.13+
  - Why 1.13+?
    - Updates to error handling
    - Updates to modules for dependency management
  - Installing Go documentation
- Installing the project
GOPROXY=https://proxy.golang.org go get github.com/mccurdyc/neighbor@latest
- Searching and Evaluating
First, you should review the Searching on GitHub documentation.
- Plain Retrieve Example
make build
./bin/neighbor --query="org:neighbor-projects NOT minikube" --plain_retrieve --projects_directory="_projects_directory" --num_projects=2 --clean=false
- Repository Search Example
make build
./bin/neighbor --query="org:neighbor-projects NOT minikube" --command="ls -al" --projects_directory="_projects_directory" --num_projects=2 --clean=false
- Code Search Example
Note: GitHub requires users to be logged in to search code, even in public repositories. Refer to the Code search documentation for building a query. Code searches are performed elastically and are not guaranteed to return exact matches. Searching code for exact matches is currently in beta and only works on very specific repositories; see the relevant section in the documentation.
It is critical that you read the above documentation because Code search may not behave as you would expect. For example,
You can't use the following wildcard characters as part of your search query:
. , : ; / \ ` ' " = * ! ? # $ & + ^ | ~ < > ( ) { } [ ]
The search will simply ignore these symbols. Additionally, I have found that using extension:EXTENSION is more reliable and accurate than filename:FILENAME.
make build
./bin/neighbor --search_type="code" --auth_token="abc123" --query="pkg/errors in:file extension:mod path:/ user:mccurdyc" --command="ls -al"
- Multi-Line Command Example
Multi-line commands work, but pipes (i.e., |) do not. In order to use pipes, you should create a custom binary that handles piping the output from one command to the next (e.g., "How to pipe several commands in Go?" on StackOverflow).
make build
./bin/neighbor --search_type="code" --auth_token="abc123" --query="pkg/errors in:file extension:mod path:/ user:mccurdyc" --command="ls \
-al"
- Confirming
One way to confirm that you obtained the number of projects that you expected is to run the following:
find _external_projects -mindepth 2 -maxdepth 2 | wc -l
Usage: neighbor (--file=<file> | --query=<string> (--command=<string> | --plain_retrieve)) [--auth_token=<github-access-token>] [--search_type=<repository|code>] [--projects_directory=<string>] [--num_projects=<int>] [--clean=<bool> | --plain_retrieve]
-alsologtostderr
log to standard error as well as files
-auth_token string
Your personal GitHub access token. This is required to access private repositories and increases rate limits.
-clean
Delete the projects directory after running the command against each project. (default true)
-command string
The command to execute on each project returned from a search query.
-file string
Absolute filepath to the config file.
-help
Print this help menu.
-log_backtrace_at value
when logging hits line file:N, emit a stack trace
-log_dir string
If non-empty, write log files in this directory
-logtostderr
log to standard error instead of files
-num_projects int
The number of _desired_ projects to obtain. (default 10)
-plain_retrieve
Whether projects should just be retrieved and not evaluated.
-projects_directory string
Where the projects should be stored locally and found for evaluation. (default "_external_projects")
-query string
The search query to execute.
-search_type string
The type of search to perform. (default "project")
-stderrthreshold value
logs at or above this threshold go to stderr
-v value
log level for V logs
-vmodule value
comma-separated list of pattern=N settings for file-filtered logging
Generate a GitHub Personal Access Token
neighbor uses token authentication for communicating and authenticating with GitHub. To read more about GitHub's token authentication, visit this site.
You can create a personal access token and use it in place of a password when performing Git operations over HTTPS with Git on the command line or the API.
Authentication is required both to raise the GitHub API rate limits and to access private content (e.g., repositories and gists).
- Use the --auth_token command-line argument
- If using a config file, add the generated token to the file
{ "auth_token": "yourAccessToken1234567890abcdefghijklmnopqrstuvwxyz", ... }
neighbor allows you to specify an executable binary to be run on a per-repository basis with each repository as the working directory.
Examples can be found in the examples directory of the repository.