Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fetch #21

Merged
merged 13 commits into from
Oct 27, 2022
Merged

Add fetch #21

merged 13 commits into from
Oct 27, 2022

Conversation

fbe-work
Copy link
Contributor

proposal to add fetch proc and fetch test

@disruptek
Copy link
Owner

Do you want to take a crack at a docstring?

Shouldn't we define a new GIT_FETCH_OPTIONS_VERSION constant?

Also, I don't think we can really assume that the only remote the user wants to fetch is origin; that remote may not even exist. A better approach might be to accept a sequence of GitRemote values and make sure it's trivial to produce such a sequence without foreknowledge of available remote names. What do you think?

@fbe-work
Copy link
Contributor Author

Good points - thank you!
I will add docstring and const GIT_FETCH_OPTIONS_VERSION.
For the remoteName would that not just be an additional fetch proc string argument?

@disruptek
Copy link
Owner

For the remoteName would that not just be an additional fetch proc string argument?

We have a GitRemote type, so I was thinking that we could accept remotes: seq[GitRemote] also. Then we can expand on the GitRemote type a bit going forward to provide such conveniences as "get all the remotes", "get remotes for which this it-expression evaluates as true", and similar.

@fbe-work
Copy link
Contributor Author

I like the idea of "get all the remotes".
So we would need additional procs to get the remotes / remote names.(?):

  • getRemoteNames : seq[string]
  • getRemotes : seq[GitRemote]

But would we not still like to call fetch with no or single remote name?
So there could be fetch proc overloads like this?:

  • repo.fetch(remoteName="origin") fetches origin or given remoteName
  • repo.fetch(remotesNames: seq[string]) fetches given remoteNames
  • repo.fetch(remote: GitRemote) fetches given remote
  • repo.fetch(remotes: seq[GitRemote]) fetches given remotes

@disruptek
Copy link
Owner

I wonder if fetch is descriptive enough -- maybe fetchRemote and fetchRemotes will be easier for the user to understand when they are reading code that uses these functions? What you name these procedures is up to you.

If it's easy enough to transform a sequence of strings into a sequence of GitRemote, perhaps we can add that transformation function and then have fewer overloads and a simpler API? I imagine there will be other uses for GitRemote objects...

@fbe-work
Copy link
Contributor Author

True, would be great to more clearly describe what the fetch proc is doing.
On the other hand, just fetch would be more similar to how other libgit2 wrappers
(e.g. for Python or C#) handle it.
Of course we could also try to serve both approaches:

  • one proc that uses just fetch and looks like in other wrappers, retrieving the found remote:
    • repo.fetch()
  • but also more specific ones, where one can be more precise:
    • repo.fetchRemote(remoteName: string)
    • repo.fetchRemotes(remotesNames: seq[string])

For the remotes: From libgit2 I get a git_str_array which in getRemoteNames
I turn into seq[string], so the string names are straight forward to get.
The GitReference I would probably(?) need to each get individually via remoteLookup,
or do know of a more elegant way there?

@disruptek
Copy link
Owner

I don't know of a more elegant way off the top of my head, but it hardly matters -- what's important is getting the API right. I would skip implementing fetch for now and stick to the more descriptive operations. Copying bad design is not good design. Consider how often this code will be written -- rarely -- and read -- often -- and consider that the cost of a fetch operation dwarfs the cost of retrieving GitRemote objects given a series of strings...

@fbe-work
Copy link
Contributor Author

Agreed.

@fbe-work
Copy link
Contributor Author

fbe-work commented Oct 4, 2022

Yikes, that is a few of unsuccessful checks.. 😶
@disruptek How do you go about testing for platforms you have no avilable?
(I certainly run the the tests on my local linux machin,e but have no access to the other two. 🤔 )

@disruptek
Copy link
Owner

I reran the tests and it looks like at least one of the memory managers is crashing in the new code; take a look at the ubuntu runs to see the test matrix, etc. You can reproduce that matrix locally using env GITHUB_ACTIONS=true balls in the repository root.

Not sure why GitHub Actions is infinite-looping, but maybe that's a symptom of the same problem.

@fbe-work
Copy link
Contributor Author

fbe-work commented Oct 5, 2022

Thank you!
Right, I saw in some logs an error at: L861dealloc addr remote
Just now realized you already have free for the git_remote,
which works nicely over here. I am curious if it that solves it.

Indeed, I shoud have a look into getting gh actions running on my fork as well.

This was linked to issues Oct 12, 2022
@disruptek
Copy link
Owner

Nice, this is getting really close! 👏

@fbe-work
Copy link
Contributor Author

yes 🙂 - I thought I better focus on the single fetch first (and keep out the multi fetch for now),
until I can - hopefully - figure out how to get the multiple fetch working with these differen gcs.
strange though, that actully the version closest to my system (linux+nim1.6) is the one failing.. 🤔

@disruptek
Copy link
Owner

You might want to look at the generated C that the Nim compiler produces and see if you can track down how gc:orc is crashing. I would build it with --debuginfo --debugger:native --embedsrc:on --linetrace:on and then run it under gdb and compare what the debugger tells you to what you find in the C source you positioned with the --nimcache option.

@fbe-work
Copy link
Contributor Author

Thank you so much for all the helpful advise! 🙂 I will have at look at these.
I was already able to reproduce some memory errors, by switching gc -
hopefully with the help of your debug flags I can find out more.

What is a little worrysome to me though, is that I discovered even on some
machines tagged as successful check, there were memory errors on fetch.. 😟

@disruptek
Copy link
Owner

I am using the gitnim distribution of Nim 1.6.7 from https://gitnim.com/ which uses the Nim nightlies binaries and a patched arc implementation which has atomic reference counting in builds that feature threads:on. That's the only difference from the standard Nim distribution.

I'm in your branch running the tests with env GITHUB_ACTIONS=true balls which runs with all the different memory management options, etc. I ran the tests seven times and only the --gc:markAndSweep --define:danger build of the tests failed, and only once. When it failed, this was the command-line balls used to trigger the failure:

nim c --gc:markAndSweep --path="." --parallelBuild:1 --incremental:off --define:danger --panics:on --experimental:strictFuncs --nimCache:/run/user/1000/balls-nimcache-c.danger.markAndSweep-16406 --out:"test_16410_-1110499242430736409" --outdir:"/run/user/1000/balls-nimcache-c.danger.markAndSweep-16406" --run --exceptions:goto  --hint[Cc]=off --hint[Link]=off --hint[Conf]=off --hint[Processing]=off --hint[Exec]=off --hint[Name]=off --hint[XDeclaredButNotUsed]=off --hint[Performance]=off --warning[UnusedImport]=off --warning[ProveInit]=off --warning[CaseTransition]=off --warning[ObservableStores]=off --warning[UnreachableCode]=off tests/test.nim

So, I dunno, maybe we should consider this a bug in Nim. I don't mind merging the PR if you can reproduce these results.

@fbe-work
Copy link
Contributor Author

I see, so that is why there are odd version numbers. 🙂
Oh nice there is a gc-auto-switcher? That is neat!

I went through the gcs manually [on nim 1.6.8] and the only gc where I could reproduce a:
SIGSEGV: Illegal storage access. (Attempt to read from nil?)
was --gc:boehm.
I was not able to dig deep enough though, to understand what is happening there.

When I run your command (~10x times), the tests always passed
and the output of fetch usually looks like this, which I find ok:

## fetchRemote
🟢 test do:
🟢   cloned := cloneme.clone(tmpdir)
🟢   check GIT_REPOSITORY_STATE_NONE == cloned.repositoryState
🟢   check cloned.fetchRemote("origin") == GIT_OK
##    11                         +480b               6.5 s

but every now and then I get this kind of strange outlier which cannot be right ~ -90tb? : 😨

## fetchRemote
🟢 test do:
🟢   cloned := cloneme.clone(tmpdir)
🟢   check GIT_REPOSITORY_STATE_NONE == cloned.repositoryState
🟢   check cloned.fetchRemote("origin") == GIT_OK
##    11                      -90.78tb               5.5 s

@disruptek
Copy link
Owner

Let's merge it and see if the tests stabilize going forward. 🤷
In any case, good job with your contribution; thanks very much! Looking forward to getting this into Nimph. 😉

@disruptek disruptek merged commit 3ab3c4e into disruptek:master Oct 27, 2022
@fbe-work
Copy link
Contributor Author

Thank you, too!! 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pull or fetch available? nimph wants fetch procedures
2 participants