summaryrefslogtreecommitdiff
path: root/content
diff options
context:
space:
mode:
Diffstat (limited to 'content')
-rw-r--r--content/development/debcargo_replace_subprocess_git2.rst92
1 files changed, 92 insertions, 0 deletions
diff --git a/content/development/debcargo_replace_subprocess_git2.rst b/content/development/debcargo_replace_subprocess_git2.rst
new file mode 100644
index 0000000..214c041
--- /dev/null
+++ b/content/development/debcargo_replace_subprocess_git2.rst
@@ -0,0 +1,92 @@
+debcargo: Replacing subprocess crate with git2 crate
+####################################################
+
+:date: 2017-07-15 15:35 +0530
+:slug: using-git2-debcargo
+:tags: rust, subprocess, git
+:author: copyninja
+:summary: Replace shell calls in debcargo to extract git information with git2
+ library crate.
+
+In my previous `post <https://copyninja.info/blog/shell-pipelines-rust.html>`_ I
+talked about using *subprocess* crate to extract beginning and ending year from
+git repository for generating debian/copyright file. In this post I'm going to
+talk on how I replaced *subprocess* with native *git2* crate and achieved the
+same result in much cleaner and safer way.
+
+git2 is a native Rust crate which provides access to Git repository internals.
+git2 does not involve any *unsafe* invocation as it is built against
+*libgit2-sys* which is actually using Rust FFI to directly bind to underlying
+libgit library. Below is the new *copyright_fromgit* function with git2 crate.
+
+.. code-block:: rust
+
+ fn copyright_fromgit(repo_url: &str) -> Result<String> {
+ let tempdir = TempDir::new_in(".", "debcargo")?;
+ let repo = Repository::clone(repo_url, tempdir.path())?;
+
+ let mut revwalker = repo.revwalk()?;
+ revwalker.push_head()?;
+
+ // Get the latest and first commit id. This is bit ugly
+ let latest_id = revwalker.next().unwrap()?;
+ let first_id = revwalker.last().unwrap()?; // revwalker ends here is consumed by last
+
+ let first_commit = repo.find_commit(first_id)?;
+ let latest_commit = repo.find_commit(latest_id)?;
+
+ let first_year =
+ DateTime::<Utc>::from_utc(
+ NaiveDateTime::from_timestamp(first_commit.time().seconds(), 0),
+ Utc).year();
+
+ let latest_year =
+ DateTime::<Utc>::from_utc(
+ NaiveDateTime::from_timestamp(latest_commit.time().seconds(), 0),
+ Utc).year();
+
+ let notice = match first_year.cmp(&latest_year) {
+ Ordering::Equal => format!("{}", first_year),
+ _ => format!("{}-{},", first_year, latest_year),
+ };
+
+ Ok(notice)
+ }
+
+So here is what I'm doing
+ 1. Use `git2::Repository::clone` to clone the given URL. We are thus avoiding
+ exec of *git clone* command.
+
+ 2. Get a revision walker object. `git2::RevWalk` implements `Iterator` trait
+ and allows walking through the git history. This is what we are using to
+ avoid exec of *git log* command.
+
+ 3. `revwalker.push_head()` is important because we want to tell revwalker from
+ where we want to walk the history. In this case we are asking it to walk
+ history from repository HEAD. Without this line next line will not work.
+ (Learned it in hard way :-) ).
+
+ 4. Then we extract `git2::Oid` which is we can say similar to commit hash and
+ can be used to lookup a particular commit. We take latest commit hash using
+ `RevWalk::next` call and the first commit using `Revwalk::last`, note the
+ order this is because `Revwalk::last` consumes the revwalker so doing it in
+ reverse order will make borrow checker unhappy :-). This replaces exec of
+ `head -n1` command.
+
+ 5. Look up the `git2::Commit` objects using `git2::Repository::find_commit`
+
+ 6. Then convert the `git2::Time` to `chrono::DateTime` and take out the years.
+
+
+After this change I found an obvious error which went unnoticed in previous
+version, that is if there was no *repository* key in Cargo.toml. When there was
+no repo URL *git clone* exec did not error out and our shell commands happily
+extracted year from the *debcargo* repository!. Well since I was testing code
+from *debcargo* repository It never failed, but when I executed from non-git
+repository folder git threw an error but that was *git log* and not *git
+clone*. This error was spotted right away because git2 threw me an error that I
+gave it empty URL.
+
+When it comes to performance I see that debcargo is faster compared to previous
+version. This makes sense because previously it was doing 5 fork and exec system
+calls and now that is avoided.