Merge branch 'updates-articles' into 'master'

Updates articles. See merge request tschwery/blog-hugo!12

Commit 641e14150a, 11 changed files with 1277 additions and 29 deletions.

articles/2017-05-12-subversion-migration.md (new file)
@@ -0,0 +1,780 @@
---
title: Subversion migration to Git
date: 2017-05-12 18:30:00
---

Some time ago I was tasked with migrating our Subversion repositories to Git. This article was only written
recently because, well, I had forgotten about the notes I had taken during the migration and only stumbled on
them recently.

Our largest repository was something like 500 GB and contained a little more than 50'000 commits. The
goal was to recover the svn history into git, keeping as much information as possible about the commits,
the links between them, and the branches. Over the years, a number of periodic database dumps
had been committed that now weighed down the repository without serving any purpose. There were also a number
of branches that were never used and contained nothing of interest.

The decision was also taken to split some of the tools into their own repositories instead of keeping them
in the same repository, cleaning up the main repository to keep only the main project and related sources.

## Principles
* After some experiments, I decided to use svn2git, a tool used by KDE for their migration. It has the
advantage of taking a rule file that allows splitting a repository by svn path, processing tags and
branches and transforming them, ignoring other paths, ...
* As the import of such a large repository is slow, I decided to mount a btrfs partition so that each
step can be snapshotted, allowing me to test the next step without any fear of having to start
again at the beginning.
* Some binary files were added to the svn history and it made sense to keep them. I decided to migrate
them to git-lfs to reduce the history size without losing them completely.
* A lot of commit messages contain references to other commits. I wanted to process these commit messages
and transform each reference like `r12345` into a git hash so that tools can create a link automatically.

## Tools
The first tool to retrieve is [svn2git](https://github.com/svn-all-fast-export/svn2git).

The compilation should be easy: first install the dependencies, then build.

```sh
$ git clone https://github.com/svn-all-fast-export/svn2git.git
$ sudo apt install libqt4-dev libapr1-dev libsvn-dev
$ qmake .
$ make
```

Once the tool is compiled, we can prepare the btrfs mount in which we will run the migration steps.

```sh
$ mkdir repositories
$ truncate -s 300G repositories.btrfs
$ sudo mkfs.btrfs repositories.btrfs
$ sudo mount repositories.btrfs repositories
$ sudo chown 1000:1000 repositories
```

We will also write a small tool in Go to process the commit messages.

```sh
sudo apt install golang
```

We will also need `bfg`, a git cleansing tool. You can download the jar
file on the [BFG Repo-Cleaner website](https://rtyley.github.io/bfg-repo-cleaner/).

## First steps
The first step of the migration is to retrieve the svn repository itself on the local machine. This is not a
checkout of the repository; we need the server folder directly, with the whole history and metadata.

```sh
rsync -avz --progress sshuser@svn.myserver.com:/srv/svn_myrepository/ .
```

In this case I had SSH access to the server, allowing me to simply rsync the repository. Doing so allowed
me to prepare the migration in advance, only copying the new commits on each synchronisation and not the
whole repository with its large history. Most of the repository files are never updated, so this step is
only slow on the first execution.

### User mapping
The next step is to create a mapping file that will map the svn users to git users. A user in svn is a username,
whereas in git it is a name and email address.

To get a list of user accounts, we can use the svn command directly on the local repository like this:

```sh
svn log file:///home/tsc/svn_myrepository \
    | egrep '^r.*lines?$' \
    | awk -F'|' '{print $2;}' \
    | sort \
    | uniq
```

This will return the list of users in the logs. For each of these users, you should create a line in a mapping
file, like so:

```txt
auser Albert User <albert.user@example.com>
aperson Anaelle Personn <anaelle.personn@example.com>
```

This file will be given as input to `svn2git` and should be complete, otherwise the import will fail.
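
Since a single unmapped author aborts the import, a quick pre-flight check can save a lot of time. A minimal sketch, reusing the pipeline above and the `accounts-map.txt` file used later in the migration command:

```sh
# Report svn authors that have no entry in the mapping file.
# Assumes the same svn log pipeline as above; adjust paths to your setup.
svn log file:///home/tsc/svn_myrepository \
    | egrep '^r.*lines?$' \
    | awk -F'|' '{gsub(/ /, "", $2); print $2;}' \
    | sort -u \
    | while read -r user; do
        grep -q "^${user} " ~/workspace/migration-tools/accounts-map.txt \
            || echo "Missing mapping for ${user}"
      done
```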

### Path mapping
The second mapping for the svn to git migration of a repository is the svn2git rules file. This file tells
the program what goes where. In our case, the repository was not strictly adhering to the standard svn tree:
it contained a trunk, tags and branches structure as well as some other folders for "out-of-branch" projects.

```txt
# We create the main repository
create repository svn_myrepository
end repository

# We create repositories for external tools that will move
# to their own repositories
create repository aproject
end repository
create repository bproject
end repository
create repository cproject
end repository

# We declare a variable to ease the declaration of the
# migration rules further down
declare PROJECTS=aproject|bproject|cproject

# We create repositories for out-of-branch folders
# that will migrate to their own repositories
create repository aoutofbranch
end repository
create repository boutofbranch
end repository

# We always ignore database dumps wherever they are.
# In our case, the database dumps are named "database-dump-20100112"
# or forms close to that.
match /.*/database([_-][^/]+)?[-_](dump|oracle|mysql)[^/]+
end match

# There are also dumps stored in their own folder
match /.*/database/backup(/old)?/.*(.zip|.sql|.lzma)
end match

# At some point the build results were also added to the history; we want
# to ignore them
match /.*/(build|dist|cache)/
end match

# We process our external tools only on the master branch.
# We use the previously declared variable to reduce the repetition
# and use the pattern match to move it to the correct repository.
match /trunk/(tools/)?(${PROJECTS})/
  repository \2
  branch master
end match

# And we ignore them if they are on tags or branches
match /.*/(tools/)?${PROJECTS}/
end match

# We start processing our main project after r10, as the
# first commits were missing the trunk and moved the branches, trunk and tags
# folders around.
match /trunk/
  min revision 10
  repository svn_myrepository
  branch master
end match

# There are branches that are hierarchically organized.
# Such cases have to be explicitly configured.
match /branches/(old|dev|customers)/([^/]+)/
  repository svn_myrepository
  branch \1/\2
end match

# Other branches are, as expected, directly in the branches folder.
match /branches/([^/]+)/
  repository svn_myrepository
  branch \1
end match

# The tags were used in a strange fashion before r2500,
# so we ignore everything before that refactoring
match /tags/([^/]+)/
  max revision 2500
end match

# After that, we create a branch for each tag as the svn tags
# were not used correctly and were committed to. We just name
# them differently and will process them afterwards.
match /tags/([^/]+)/([^/]+)/
  min revision 2500
  repository svn_myrepository
  branch \1-\2
end match

# Our out-of-branch folders are processed directly, only creating
# a master branch.
match /aoutofbranch/
  repository aoutofbranch
  branch master
end match

match /boutofbranch/
  repository boutofbranch
  branch master
end match

# Everything else is discarded and ignored
match /
end match
```

This file will quickly grow with the number of migration operations that you want to do. Ignore
files here if possible, as it will reduce the migration time as well as the postprocessing that
needs to be done afterwards. In my case, a number of files were too complex to match during the migration
or were spotted only afterwards and had to be cleaned in a second pass with other tools.

### Migration
This step will take a lot of time as it reads the whole svn history, processes the declared rules and generates
the git repositories and every commit.

```sh
$ cd repositories
$ ~/workspace/svn2git/svn-all-fast-export \
    --add-metadata \
    --svn-branches \
    --identity-map ~/workspace/migration-tools/accounts-map.txt \
    --rules ~/workspace/migration-tools/svnfast.rules \
    --commit-interval 2000 \
    --stat \
    /home/tsc/svn_myrepository
```

If there is a crash during this step, it means that you are either missing an account in your mapping,
that one of your rules is emitting an erroneous branch or repository, or that no rule is matching.

Once this step is finished, I like to do a btrfs snapshot so that I can return to this point when putting the
next steps into place.

```sh
btrfs subvolume snapshot -r repositories repositories/snap-1-import
```

## Cleanup
The next phase is to clean up our import. There will always be a number of branches that are unused, named
incorrectly, contain only temporary files, or are so far from the standard naming that our
rules cannot process them correctly.

We will simply delete or rename them using git.

```sh
$ cd svn_myrepository
$ git branch -D oldbranch-0.3.1
$ git branch -D customer/backup_temp
$ git branch -m customer/stable_v1.0 stable-1.0
```

The goal at this step is to clean up the branches that will be kept after
the migration. We do this now to reduce the repository size early on and
thus reduce the time needed for the next steps.

If you see branches that can be deleted or renamed further down the road,
you can also remove or rename them then.

I like to take a snapshot at this stage as the next stage usually involves
a lot of tests and manually building a list of things to remove.

```sh
btrfs subvolume snapshot -r repositories repositories/snap-2a-cleanup
```

We can also remove files that were added and should not have been, by generating
a list of every file ever checked into our new git repository, inspecting
it manually and adding the identifiers of the files to remove to a new file:

```sh
$ git rev-list --objects --all > ./all-files
$ cat ./all-files | your-filter | cut -d' ' -f1 > ./to-delete-ids
$ java -jar ~/Downloads/bfg-1.12.15.jar --private --no-blob-protection --strip-blobs-with-ids ./to-delete-ids
```

We will take a snapshot again, as the next step also involves checks and
tests.

```sh
btrfs subvolume snapshot -r repositories repositories/snap-2b-cleanup
```

Next, we will convert the binary files that we still want to keep in our
repository to Git-LFS. This allows git to only keep track of the hash of
the file in the history and not store the whole binary in the repository,
thus reducing the size of the clones.

BFG does this quickly and efficiently, removing every file matching the
given name from the history and storing it in Git-LFS. This step will
require some exploration of the previous `all-files` file to identify which
files need to be converted.

```sh
$ java -jar ~/Downloads/bfg-1.12.15.jar --no-blob-protection --private --convert-to-git-lfs 'my-important-archive*.zip'
$ java -jar ~/Downloads/bfg-1.12.15.jar --no-blob-protection --private --convert-to-git-lfs '*.ear'
```

After the cleanup, I also like to do a btrfs snapshot so that the history
rewrite step can be executed and tested multiple times.

```sh
btrfs subvolume snapshot -r repositories repositories/snap-2c-cleanup
```

### Linking a svn revision to a git commit
For each revision, the import log prints a line mapping it to a mark; in the git repository, there
is then a marks file that maps each mark to a commit hash. We can use this information to build a mapping
database that stores that information for later.
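
To make the two inputs concrete, here is roughly what they look like (an illustration based on the parsing code below; the exact format may vary between svn2git versions). The `:456` mark is the join key between the two files:

```txt
# log-svn_myrepository: one line per imported revision
progress SVN r1234 branch master = :456

# marks-svn_myrepository: one line per fast-import mark
:456 a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0
```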

In our case, I wrote a Java program that parses both files and stores
the resulting mapping in a LevelDB database.

This database will then be used by a Go server that reads the mapping
into memory and exposes an RPC endpoint that we will call from Go
binaries in a `git filter-branch` call. The server also needs
to keep track of the modifications to the git commit hashes as the history
rewrite changes them.

First, the Java tool to read the logs and generate the LevelDB database:

```java
import com.google.common.collect.BiMap;
import com.google.common.collect.HashBiMap;
import java.io.File;
import java.io.FileReader;
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import org.apache.commons.io.FileUtils;
import org.apache.commons.io.IOUtils;
import org.apache.commons.io.filefilter.DirectoryFileFilter;
import org.apache.commons.io.filefilter.IOFileFilter;
import org.iq80.leveldb.DB;
import org.iq80.leveldb.Options;
import org.iq80.leveldb.impl.Iq80DBFactory;

public class CommitMapping {

    public static String FILE_LOG_IMPORT = "../log-svn_myrepository";
    public static String FILE_MARKS = "marks-svn_myrepository";
    public static String FILE_BFG_DIR = "../svn_myrepository.bfg-report";

    public static Pattern PATTERN_LOG = Pattern.compile("^progress SVN (r\\d+) branch .* = (:\\d+)");

    public static void main(String[] args) throws Exception {

        List<String> importLines = IOUtils.readLines(new FileReader(new File(FILE_LOG_IMPORT)));
        List<String> marksLines = IOUtils.readLines(new FileReader(new File(FILE_MARKS)));

        // Collect the "object-id-map.old-new.txt" file of every BFG cleanup pass,
        // sorted by pass, so that the hash rewrites can be replayed in order.
        Collection<File> passFilesCol = FileUtils.listFiles(new File(FILE_BFG_DIR), new IOFileFilter() {
            @Override
            public boolean accept(File pathname, String name) {
                return name.equals("object-id-map.old-new.txt");
            }

            @Override
            public boolean accept(File path) {
                return this.accept(path, path.getName());
            }
        }, DirectoryFileFilter.DIRECTORY);

        List<File> passFiles = new ArrayList<>(passFilesCol);

        Collections.sort(passFiles, (File o1, File o2) -> o1.getParentFile().getName().compareTo(o2.getParentFile().getName()));

        Map<String, String> commitToIdentifier = new LinkedHashMap<>();
        Map<String, String> identifierToHash = new HashMap<>();

        // Map each svn revision (r1234) to its fast-import mark (:456).
        for (String importLine : importLines) {
            Matcher marksMatch = PATTERN_LOG.matcher(importLine);

            if (marksMatch.find()) {
                String dest = marksMatch.group(2);
                if (dest == null || dest.length() == 0 || ":0".equals(dest)) continue;

                commitToIdentifier.put(marksMatch.group(1), dest);
            } else {
                System.err.println("Unknown line: " + importLine);
            }
        }

        File dbFile = new File(System.getenv("HOME") + "/mapping-db");
        File humanFile = new File(System.getenv("HOME") + "/mapping");

        FileUtils.deleteQuietly(dbFile);

        Options options = new Options();
        options.createIfMissing(true);
        DB db = Iq80DBFactory.factory.open(dbFile, options);

        // Map each fast-import mark (:456) to the initial git commit hash.
        marksLines.stream().map((line) -> line.split("\\s", 2)).forEach((parts) -> identifierToHash.put(parts[0], parts[1]));

        BiMap<String, String> commitMapping = HashBiMap.create(commitToIdentifier.size());
        for (String commit : commitToIdentifier.keySet()) {

            String importId = commitToIdentifier.get(commit);
            String hash = identifierToHash.get(importId);

            if (hash == null) continue;
            commitMapping.put(commit, hash);
        }

        System.err.println("Got " + commitMapping.size() + " svn -> initial import entries.");

        // Replay the BFG hash rewrites so that each revision maps to the final hash.
        for (File file : passFiles) {
            System.err.println("Processing file " + file.getAbsolutePath());

            List<String> bfgPass = IOUtils.readLines(new FileReader(file));
            Map<String, String> hashMapping = bfgPass.stream().map((line) -> line.split("\\s", 2)).collect(Collectors.toMap(parts -> parts[0], parts -> parts[1]));

            for (String hash : hashMapping.keySet()) {
                String rev = commitMapping.inverse().get(hash);
                if (rev != null) {
                    String newHash = hashMapping.get(hash);
                    System.err.println("Replacing " + rev + ", was " + hash + ", is " + newHash);
                    commitMapping.replace(rev, newHash);
                }
            }
        }

        PrintStream fos = new PrintStream(humanFile);
        for (Map.Entry<String, String> entry : commitMapping.entrySet()) {
            String commit = entry.getKey();
            String target = entry.getValue();

            fos.println(commit + "\t" + target);
            db.put(Iq80DBFactory.bytes(commit), Iq80DBFactory.bytes(target));
        }

        db.close();
        fos.close();
    }
}
```

We will use RPC between a client and a server so that the LevelDB database
can be kept open by a long-running server, queried by very light clients,
as one client will be executed for each commit. In my tests, opening the
database was really time-consuming, hence this approach, even though the
server does very little.

The structure of our go project is the following:

```txt
go-gitcommit/client-common:
    rpc.go

go-gitcommit/client-insert:
    insert-mapping.go

go-gitcommit/client-query:
    query-mapping.go

go-gitcommit/server:
    server.go
```

First, some plumbing for the RPC in `rpc.go`:

```go
package Client

import (
	"net"
	"net/rpc"
	"time"
)

type (
	// Client -
	Client struct {
		connection *rpc.Client
	}

	// MappingItem is the response from the cache or the item to insert into the cache
	MappingItem struct {
		Key   string
		Value string
	}

	// BulkQuery allows to mass query the DB in one go.
	BulkQuery []MappingItem
)

// NewClient -
func NewClient(dsn string, timeout time.Duration) (*Client, error) {
	connection, err := net.DialTimeout("tcp", dsn, timeout)
	if err != nil {
		return nil, err
	}
	return &Client{connection: rpc.NewClient(connection)}, nil
}

// InsertMapping -
func (c *Client) InsertMapping(item MappingItem) (bool, error) {
	var ack bool
	err := c.connection.Call("RPC.InsertMapping", item, &ack)
	return ack, err
}

// GetMapping -
func (c *Client) GetMapping(bulk BulkQuery) (BulkQuery, error) {
	var bulkResponse BulkQuery
	err := c.connection.Call("RPC.GetMapping", bulk, &bulkResponse)
	return bulkResponse, err
}
```

Next, the Go server that reads this database, in `server.go`:

```go
package main

import (
	"fmt"
	"log"
	"net"
	"net/rpc"
	"os"
	"time"

	"github.com/syndtr/goleveldb/leveldb"

	Client "../client-common"
)

var (
	cacheDBPath = os.Getenv("HOME") + "/mapping-db"

	cacheDB *leveldb.DB
	flowMap map[string]string

	f *os.File
	g *os.File
)

type (
	// RPC is the base class of our RPC system
	RPC struct {
	}
)

func main() {
	var cacheDBerr error

	cacheDB, cacheDBerr = leveldb.OpenFile(cacheDBPath, nil)
	if cacheDBerr != nil {
		fmt.Fprintln(os.Stderr, "Unable to initialize the LevelDB cache.")
		log.Fatal(cacheDBerr)
	}

	roErr := cacheDB.SetReadOnly()
	if roErr != nil {
		fmt.Fprintln(os.Stderr, "Unable to initialize the LevelDB cache.")
		log.Fatal(roErr)
	}

	// flowMap tracks the hash rewrites done by the current filter-branch run.
	flowMap = make(map[string]string)

	f, _ = os.Create(os.Getenv("HOME") + "/go-server/gomapping.log")
	defer f.Close()
	g, _ = os.Create(os.Getenv("HOME") + "/go-server/gomapping.ins")
	defer g.Close()

	rpc.Register(NewRPC())

	l, e := net.Listen("tcp", ":9876")
	if e != nil {
		log.Fatal("listen error:", e)
	}

	go flushLog()

	rpc.Accept(l)
}

func flushLog() {
	for {
		time.Sleep(100 * time.Millisecond)
		f.Sync()
	}
}

// NewRPC -
func NewRPC() *RPC {
	return &RPC{}
}

// InsertMapping records a hash rewrite (old commit hash -> new commit hash).
func (r *RPC) InsertMapping(mappingItem Client.MappingItem, ack *bool) error {
	old := mappingItem.Key
	new := mappingItem.Value

	flowMap[old] = new

	g.WriteString(fmt.Sprintf("Inserted mapping %s -> %s\n", old, new))

	*ack = true

	return nil
}

// GetMapping resolves svn revisions to the rewritten git commit hashes.
func (r *RPC) GetMapping(bulkQuery Client.BulkQuery, resp *Client.BulkQuery) error {
	for i := range bulkQuery {
		key := bulkQuery[i].Key

		response, _ := cacheDB.Get([]byte(key), nil)

		// By default, leave the reference untouched.
		gitCommit := key
		if response != nil {
			responseStr := string(response[:])
			responseUpdated := flowMap[responseStr]
			if responseUpdated != "" {
				gitCommit = string(responseUpdated[:])[:12] + "(" + key + ")"

				f.WriteString(fmt.Sprintf("Response to mapping %s -> %s\n", bulkQuery[i].Key, gitCommit))
			} else {
				f.WriteString(fmt.Sprintf("No git mapping for entry %s\n", responseStr))
			}
		} else {
			f.WriteString(fmt.Sprintf("Unknown revision %s\n", key))
		}

		bulkQuery[i].Value = gitCommit
	}

	*resp = bulkQuery

	return nil
}
```

And finally our clients. The insert client will be called from `git filter-branch`
with the previous and current commit hashes after processing each commit. We
store this information in the database so that the hashes are correct when
mapping a revision. The code goes into `insert-mapping.go`:

```go
package main

import (
	"fmt"
	"log"
	"os"
	"time"

	Client "../client-common"
)

func main() {
	// Called by filter-branch with the original and the rewritten commit hash.
	old := os.Args[1]
	new := os.Args[2]

	rpcClient, err := Client.NewClient("localhost:9876", time.Millisecond*500)
	if err != nil {
		log.Fatal(err)
	}

	mappingItem := Client.MappingItem{
		Key:   old,
		Value: new,
	}

	ack, err := rpcClient.InsertMapping(mappingItem)
	if err != nil || !ack {
		log.Fatal(err)
	}

	// filter-branch expects the resulting commit hash on stdout.
	fmt.Println(new)
}
```

The query client will receive the commit message for each commit, check
whether it contains an `rXXXX` reference and query the server for the matching
git hash. It goes into `query-mapping.go`:

```go
package main

import (
	"fmt"
	"io/ioutil"
	"log"
	"os"
	"regexp"
	"strings"
	"time"

	client "../client-common"
)

func main() {
	// Read the whole commit message so that multi-line messages are preserved.
	raw, _ := ioutil.ReadAll(os.Stdin)
	text := string(raw)

	re := regexp.MustCompile(`\Wr[0-9]+`)
	matches := re.FindAllString(text, -1)

	if matches == nil {
		fmt.Print(text)
		return
	}

	rpcClient, err := client.NewClient("localhost:9876", time.Millisecond*500)
	if err != nil {
		log.Fatal(err)
	}

	var bulkQuery client.BulkQuery

	for i := range matches {
		// Skip references like "xyz-r123"; the leading non-word character
		// matched by the regex is stripped before querying.
		if matches[i][0] != '-' {
			key := matches[i][1:]
			bulkQuery = append(bulkQuery, client.MappingItem{Key: key})
		}
	}

	gitCommits, _ := rpcClient.GetMapping(bulkQuery)

	for i := range gitCommits {
		gitCommit := gitCommits[i].Value
		key := gitCommits[i].Key

		text = strings.Replace(text, key, gitCommit, 1)
	}

	fmt.Print(text)
}
```

For this step, we first compile and run the Java program.
Once it has created the database, we compile the Go binaries and start
the server in the background.
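
A minimal sketch of that sequence (the jar name is a placeholder; it assumes the Java tool is packaged with its dependencies and the Go layout shown earlier):

```sh
# Generate the LevelDB mapping database (writes to ~/mapping-db).
$ java -jar commit-mapping.jar

# Build the Go binaries and start the server in the background.
$ cd ~/migration-tools/go-gitcommit
$ go build -o server/server ./server
$ go build -o client-insert/client-insert ./client-insert
$ go build -o client-query/client-query ./client-query
$ ./server/server &
```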

Then, we can launch `git filter-branch` on our repository to rewrite the
history:

```sh
$ git filter-branch \
    --commit-filter 'NEW=`git_commit_non_empty_tree "$@"`; \
        ${HOME}/migration-tools/go-gitcommit/client-insert/client-insert $GIT_COMMIT $NEW' \
    --msg-filter "${HOME}/migration-tools/go-gitcommit/client-query/client-query" \
    -- --all --author-date-order
```

As after each step, we generate a snapshot, even though this should be
the last step that cannot be repeated easily.

```sh
btrfs subvolume snapshot -r repositories repositories/snap-3-mapping
```

We now clean the repository, which at this point contains a lot of unused blobs,
branches, commits, ...

```sh
$ git reflog expire --expire=now --all
$ git prune --expire=now --progress
$ git repack -adf --window-memory=512m
```

We now have a repository that should be more or less clean. You will have
to check the history, the size of the blobs and whether some branches can
still be deleted before pushing it to your server.

articles/2019-08-17-my-git-workflow.md (new file)
@@ -0,0 +1,304 @@
---
title: My Git workflow
date: 2019-08-17 16:00:00
---

[Git](https://git-scm.com/) is currently the most popular Version Control
System and probably needs no introduction. I have been using it for some
years now, both for work and for personal projects.

Before that, I used Subversion for nearly 10 years and was more or less
happy with it. More or less, because it required being online to do more
or less anything: committing needs to be online, viewing logs needs to be online,
checking out an older revision needs to be online, ...

Git does not require anything online (except, well, `git push` and `git pull/fetch`,
for obvious reasons). Branching is also way easier in Git, allowing you to work
offline on some feature on your branch, commit when you need to and then push your
work when online. It was a pleasure to discover these features and the
workflow that derived from them.

This article describes my workflow using Git and is not a tutorial or
a guide on using Git. It also contains my Git configuration, which matches
this workflow but could be useful for others.

## Workflow

This workflow comes heavily from the [GitHub Flow](https://guides.github.com/introduction/flow/index.html)
and the [GitLab Flow](https://docs.gitlab.com/ee/topics/gitlab_flow.html).

These workflows are based on branches coming out of master and being
merged back into master on completion. I found the [Git Flow](https://nvie.com/posts/a-successful-git-branching-model/)
to be too complicated for my personal projects, and extending the GitHub Flow
with a set of stable branches and tags has worked really well at work, as
described in [Release branches with GitLab flow](https://docs.gitlab.com/ee/topics/gitlab_flow.html#release-branches-with-gitlab-flow).


### 1. Create a new branch.

I always create a new branch when starting something.
This allows switching easily between tasks if some urgent work comes in, without
having to pile up modifications in the stash.

When working on personal projects, I tend to be more lax about these branches,
creating a branch that will contain more than one change and reviewing them
all in one go afterwards.

Why create a branch and not commit directly into master? Because you
want tests to check that your commits are correct before the changes are
written in stone. A branch can be modified or deleted, the master branch
cannot. Even for small projects, I find that branches let you work
more peacefully, allowing you to iterate on your work.

A branch is created by `git checkout -b my-branch` and can immediately be used
to commit things.
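
For instance, with a hypothetical feature branch:

```sh
$ git checkout -b add-redis-cache   # create the branch and switch to it
$ git add pom.xml
$ git commit -m "Add Redis library to the pom"
```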

### 2. Commit often.
This advice comes up every time with Git: you can commit anytime, anything.
It is way easier to squash commits together further down the line than it is to
split a commit 2 days after the code was written.

Your commits are still local-only, so have no fear committing incomplete or
what you consider sub-par code that you will refine later. With that come the next points.

### 3. Add only the needed files.
With Git you can and must add files to the index before
you commit. When working on large projects, you will modify multiple files.
When committing, you can add one file to the index, commit the changes to this file,
then add the second file to the index and commit those changes in a second commit.
Git also allows you to add only parts of a file with `git add -p`. This
can be useful if you forgot to commit a step before starting work on the
next step, as sketched below.
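
A small sketch of that flow (file names are made up for the example):

```sh
# Two unrelated changes in the working tree, committed separately.
$ git add src/cache/RedisCache.java
$ git commit -m "Add Redis cache implementation"
$ git add pom.xml
$ git commit -m "Add Redis library to the pom"

# Or, when both changes ended up in the same file,
# stage only some hunks interactively.
$ git add -p src/cache/RedisCache.java
```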

### 4. Write useful commit messages.
Even though your commits are not yet published, commit messages are also
useful for you.

I won't give you advice on how to write a commit message, as this depends
on the project and the team I'm working with, but remember that a commit
message is there to describe *what* you did and *why*.

Here are some rules I like to follow:

1. Write a short description of *why* and *what*. Your commit message
should be short but explain both. A `git log --oneline` should produce
a readable log that tells you what happened.
2. Be precise. You polished up your cache prototype? Don't write *General polishing*;
say *what* and *why*, like *Polishing the Redis caching prototype*.
3. Be concise. You fixed tests that were failing because of the moon and
planets alignment and solar flares? Don't write a novel on one line like
*Adding back the SmurfVillageTest after fixing the planet alignment and
the 100th Smurf was introduced through a mirror and everybody danced happily
ever after*. The longest I would go for is *Fixed failing SmurfVillageTest for 100th Smurf*.
4. Use the other lines. You can write a multi-line commit message if you need
to explain the context in detail. Treat your commit like you would an
email: short subject, long message if needed.
The Linux kernel is generally a really good example of good long commit messages, like
[cramfs: fix usage on non-MTD device](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3e5aeec0e267d4422a4e740ce723549a3098a4d1)
or
[bpf, x86: Emit patchable direct jump as tail call](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=428d5df1fa4f28daf622c48dd19da35585c9053c).
5. In any case, don't write messages like *Update presentation* in 10
different commits, or even worse *Fix stuff*. It's not useful, neither for
you nor for your colleagues.

Here are some links about commit messages. Don't ignore this; in my opinion
it is a really important part of every VCS:

* [Commit Often, Perfect Later, Publish Once - Do make useful commit messages](https://sethrobertson.github.io/GitBestPractices/#usemsg)
* [A Note About Git Commit Messages](https://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html)
* [Introduction: Why good commit messages matter](https://chris.beams.io/posts/git-commit/)

### 5. Refine your commits.
At this point, if you coded as a normal human being,
you will have a large number of commits: some that introduce new
small features, like *Add cache to build process*, some that fix typos,
like *Fix typo in cache configuration key*, some others that add a missing
library, like *Oops, forgot to add the Redis library to the pom*. Nothing
to worry about; to err is human, computers are there to catch errors and let
you fix them easily.

Before pushing the work online, I like to [hide the sausage making](https://sethrobertson.github.io/GitBestPractices/#sausage).
Personally, I find that the downsides are outweighed by the fact that you
reduce the time needed to commit things while coding and organize stuff
once your mind is free of code-related thoughts.

These commits are not useful for other people; they are only there because
you made a mistake. No shame in that, but the reviewers don't need to see
these, they need to have a clear view of *what* and *why*.

The cache library was added because we added a cache, the configuration
key is there because we added a cache. The commits should reflect our work,
not our mistakes. In this example, I would only keep one commit, *Add cache
to the build process*, and squash the errors into it.

At this step, I like to rebase my branch on the current master
with `git rebase -i origin/master` so that I can reorder and squash commits
as well as get the latest changes into my branch.
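
In the interactive rebase todo list, squashing the fix-up commits into the feature commit then looks roughly like this (made-up hashes):

```txt
pick  a1b2c3 Add cache to build process
fixup d4e5f6 Fix typo in cache configuration key
fixup f6e5d4 Oops, forgot to add the Redis library to the pom
```

Using `fixup` keeps only the first commit's message, which is exactly the "hide the sausage making" effect described above.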

### 6. Rebase your branch
Usually, before your work on a feature is finished, a number of changes
have landed on the master branch: new features, fixes, perhaps new tests if
you are lucky. Before pushing, I thus do a quick `git fetch && git rebase origin/master`,
just so that my branch is up to date with the branch I will merge into.

With the latest changes in my branch, I like to run the test suite one
last time.

### 7. Check your commit messages
Before pushing, I like to do a quick `git log --oneline` to check my
commits.

Your change descriptions should make sense; you should be able, at this
point, to remember for each commit what changed and why you did it, and the
message should reflect this.

If one commit message is vague, this is the last chance to rewrite it. I
usually do that with an interactive rebase: `git rebase origin/master -i`.

### 8. Pushing the branch
Once everything is in order, the branch can be pushed, a Pull Request/Merge Request/Review Request
can be opened and other people brought into the changes.

### 9. Review
If you work in a team, you will have a code review step before merging changes.
I like to see this as a step to ensure that I did not miss anything. When you
code your fix or feature, it is really easy to forget some corner case or
some business requirement that was introduced by another customer to a
colleague. I like to see the review step as peace of mind that you did not
forget something important, and that if you forgot something, it was not
that important, as four eyes did not spot it.

The review is also a way for your colleagues to keep up to date with your
work. Whatever is in the master branch has been seen by two people and should
be understood by two people. It's important to have someone else that can
fix that part of the code in case you are absent.

These people will need to quickly know what changed and why you changed
it. Usually the tooling will quickly allow people to check what changed,
comment on those changes and request improvements. The why will come from
your commit messages.

I also like to keep this step even when working alone. I review my own
code to ensure that the changes are clear and that I committed everything
I needed to, and only what I wanted to.

### 10. Changes
Usually you will have to change some parts after the review. It can be
because you remembered something walking down the corridor to get tea, or
because your colleagues saw possible improvements.

For these changes, I like to follow the same procedure as before: write
the changes, commit them, fix the old commits to keep the log clean. I
see the review as part of the work, not something that comes after
and must be recorded in the logs. In short:

* A new feature is requested by the reviewer? New commit.
* A typo must be fixed? Fix the commit that introduced it.
* Some CI test fails? Fix the commit that introduced the regression, or introduce
a new commit to fix the test.

## The dangers
This workflow is `rebase` heavy. If you have some modifications that
conflict with your changes, you will have to resolve the conflicts, perhaps
on multiple commits during the rebase, with the possible errors that will
come out of it. If the conflicts are too much, you can always abort the
rebase and try to reorder your commits to reduce the conflicts, if possible.

The fact that you rebase will also hide the origin of problems coming from
your parent branch. If you pull code with failing tests, you will have
nothing in the history that tells you that your code worked before pulling
the changes. Only your memory (and the `reflog`, but who checks the `reflog`?)
will tell you that it worked before; there is no commit marking the before
and the after like there would be in a `merge` workflow. On tools like
GitLab, you will see that there were pipelines that were succeeding and then
a pipeline failing, but you will need to check the changes between the
succeeding and the failing pipelines.

If you are not alone on your branch, rebasing can cause a lot of issues when
pulling and pushing with two rebased branches with different commits in them.
Be sure to only rebase when everyone has committed everything and the branch
is ready to be reviewed and merged.

## Git aliases
Since I do some operations a number of times each day, I like to simplify
them by using aliases in my `.gitconfig`.

The first two are aliases to check the logs before pushing the changes.
They print a one-liner for each commit, one without merge commits, the
other with merge commits and a graph of the branches.

The last two are aliases for branch creation and publication. Instead of
having to know whether I have to create a new branch or can directly
check out an existing branch, I wrote the `go` alias to "go" to the branch,
creating it if needed. The `publish` alias allows me to push a branch created
locally to the origin without having to specify anything.

The `commit-oups` alias is a shorthand to amend the last commit without changing
the commit message. It happens often that I forget to add a file to the
index, or commit too early, or forget to run the tests, or forget
a library. This alias allows me to do a `git add -u && git commit-oups`
in these cases. (Yes, "oups" is French for "oops".)

```ini
[alias]
    # Shorthand to print a graph log with one-liner commit messages.
    glog = log --graph --pretty=format:'%C(yellow)[%ad]%C(reset) %C(green)[%h]%C(reset) %s %C(red)[%an]%C(blue)%d%C(reset)' --date=short

    # Shorthand to print a log with one-liner commit messages, ignoring merge commits.
    slog = log --no-merges --pretty=format:'%C(yellow)[%ad]%C(reset) %C(green)[%h]%C(reset) %s %C(red)[%an]%C(blue)%d%C(reset)' --date=short

    # Prints out the current branch. This alias is used by other aliases.
    branch-name = "!git rev-parse --abbrev-ref HEAD"

    # Shorthand to amend the last commit without changing the commit message.
    commit-oups = commit --amend --no-edit

    # Shorthand to facilitate the remote creation of new branches. This allows
    # the user to push a new branch to the origin easily.
    publish = "!git push -u origin $(git branch-name)"

    # Shorthand to facilitate the creation of new branches. This switches to
    # the given branch, creating it if necessary.
    go = "!go() { git checkout -b $1 2> /dev/null || git checkout $1; }; go"
```

## Releases
This article only detailed the daily work on a feature and the
merge, but did not go into detail on the release process. This is deliberate,
as every release is different. In my personal projects alone I have multiple
ways to represent releases.

On my blog there are no releases; everything on master is published
as it is merged.

On my Kubernetes project, a release is something more precise but not
static. I want to be sure that it works, but it can be updated easily.
It is thus represented by a single stable branch onto which I merge master
once I want to deploy the changes.

On my keyboard project, a release is something really static,
as it represents a PCB, an object that cannot be updated easily. It is
thus a tag with the PCB order reference. Once the firmware is introduced,
this could change with the introduction of a stable branch that will follow
the changes to the firmware and configuration. Or I could continue using tags;
this will be decided once the hardware is finished.

## Conclusion
As always with Git, the tool is so powerful that more or less any workflow
can work with it. There are a number of possible variations on this, with
each team having a favorite way of doing things.

In this article I did not talk about tooling, but nowadays, with CI/CD
becoming more and more important, tooling is an important part of the workflow.
Tests will need to be run on branches; perhaps `stable` branches will
have more tests than `feature` branches due to server/time/financial limitations.
Perhaps you have Continuous Deployment of stable branches, perhaps you want
to continuously redeploy a development server when code is merged on
master.

Your tooling will need a clear flow. If you have a convention that
new features are developed on branches that have a `feature/` prefix, everybody
must follow it, otherwise the work to reconcile this in your tooling will
be daunting for the developer in charge of these tools.

articles/2019-11-21-borg-migration.md (new file)
@@ -0,0 +1,111 @@
---
title: Backup archives migration to Borg
date: 2019-11-21 18:30:00
---

Last weekend I found a number of encrypted hard drives that were used to do periodic
backups from 2006 to 2014. At the time, the backups were made using rsync
with hard-links, to store only one occurrence of each file if it had not changed
since the last backup.

I wanted to check that everything was still there and upgrade this to a
Borg repository, so that I can benefit from compression and deduplication
to further reduce the size of these backups and store them in a more secure way.

## Check the backups
The backups were made using hard-links, with one backup corresponding to
one folder, as follows:

```sh
$ ls backups/feronia/
back-2014-06-19T19:05:10/ back-2014-10-10T07:30:00/
back-2014-12-24T14:34:44/ current@
```

To check that the backups were still readable, I listed the content of
the different folders and checked that some known configuration files were
present and matched what was expected. This worked until I reached backups
made before I was using awesomewm, at a time when I changed a lot of config
files to match my usage instead of using the default ones.

All in all, the backups were still good and readable; I could use these
as a basis for the transition to a more robust and space-efficient system.

I saw a number of freezes during the check, which I interpreted as signs of old
age for the spinning rust.

## Initialize the Borg backup
The first step is to initialize the borg repository. We will put it on
one of the known-good backup drives that still has some room. To estimate
the space needed for the backups, I took the size of the most recent backup
and multiplied it by two, as I know that I did not delete a lot of files and
that the deduplication will reduce the size of the old backups, which contained
a lot of checked-out subversion repositories.
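
A quick way to get that estimate (`current` is the symlink to the latest backup seen in the listing above; the size shown is indicative):

```sh
# Size of the most recent backup; double this for the repository estimate.
$ du -sh backups/feronia/current/
70G     backups/feronia/current/
```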

So, with a destination for my borg repository, I created a folder on the
disk and gave my user read-write rights on this folder.

```sh
$ sudo mkdir backups/borg-feronia
$ sudo chown 1000:1000 backups/borg-feronia -R
```

Then, the creation of the repository with borg:

```sh
$ borg init --encryption=repokey backups/borg-feronia
Enter new passphrase:
Enter same passphrase again:
Do you want your passphrase to be displayed for verification? [yN]: n
[...]
```

I decided to use the `repokey` encryption mode. This mode stores the key
in the repository, allowing me to only remember the passphrase and not have
to worry about backing up the key file.

## Transfer the existing backups to Borg
The borg repository has been initialized; we can now start migrating the
backups from the hard-linked folders into borg.

As borg does not care about hard-links, we can simply loop over the different
folders and create a new archive from each. It will take some time because
in each directory it will walk over the whole content, hash it, check whether
it changed, deduplicate it, compress it and then write it. Each backup
of approximately 70 GiB took one hour to migrate on my computer. It seems
that the process is limited by the single-thread performance of your CPU.

```sh
$ export BORG_PASSPHRASE=asdf
$ for i in back*; do \
    archivename=$(echo $i | cut -c 6-15); \
    pushd $i; \
    borg create --stats --progress ~/backups/borg-feronia::$archivename .; \
    popd; \
done;
```

The env variable allows us to walk away at this stage and let the computer
do its magic for some hours.

## Check the migrated backups
Once the backups have been migrated, we need to check that everything is
in order before doing anything else.

I did the same as before, using this time `borg list` and `borg extract`
to check whether the files are present and their content is correct.
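
For example (archive names and paths are illustrative):

```sh
# List the archives, then the files inside one of them.
$ borg list ~/backups/borg-feronia
$ borg list ~/backups/borg-feronia::2014-12-24

# Extract a single known file into the current directory and inspect it.
$ borg extract ~/backups/borg-feronia::2014-12-24 home/tsc/.config/awesome/rc.lua
```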

## Archive these backups
Once the migrated backups have been tested, we can shred the old hard drives
that were showing signs of old age.

Since storage is so cheap nowadays, I will also transfer an archive of
the Borg backup folder to an online storage service, so as to be able
to retrieve it in case the local storage media are destroyed or otherwise
unreadable in the future.

I chose to simply create a tar archive of the Borg folder and upload it
to AWS S3, since these backups will not be updated. Perhaps some day I will
add the more recent backups to this setup, but for now they are a read-only
window into the laptop I had during my studies and during my first jobs.
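
The upload itself can then be as simple as this sketch (the bucket name is a placeholder):

```sh
# Pack the whole Borg repository and send it to S3.
$ tar -cf borg-feronia.tar borg-feronia/
$ aws s3 cp borg-feronia.tar s3://my-archive-bucket/borg-feronia.tar
```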

@@ -9,17 +9,28 @@ new things. I earned a Master’s Degree in Computer Science graduating from
 the EPFL (Swiss Federal Institute of Technology in Lausanne), with a
 specialization in Software Systems.
 
-When not programming for my job, I like to [design](https://git.inf3.xyz/tschwery/custom-keyboard) and build mechanical keyboards,
-[shooting my bow](https://les-archers-du-bisse.ch/) and
-[cooking](https://recettes.inf3.ch) with my family.
+When not programming for my job, I like to
+[design](https://git.inf3.xyz/tschwery/custom-keyboard) and build mechanical keyboards,
+[shooting my bow](https://les-archers-du-bisse.ch/),
+[cooking](https://recettes.inf3.ch) with my family
+and learn new things by coding random and not-so-random projects.
 
 This blog is a collection of things learnt during these explorations.
 
 ## My job
 
-I have been working as a Software Developer at [SAI-ERP](https://sai-erp.net) since 2011.
+I have been working as a Software Developer at [Groupe T2i](https://groupe-t2i.com)
+since 2019.
 
-I have previously worked as a student assistant at [EPFL](https://ic.epfl.ch).
+I have previously worked
+as a Software Developer at [SAI-ERP](https://sai-erp.net) from 2011 to 2019
+and as a student assistant at [EPFL](https://ic.epfl.ch) from 2009 to 2012.
 
 ## Contact me
 
-Find me on [Github](https://github.com/tschwery/) / my private [GitLab instance](https://git.inf3.xyz/explore/projects) / [Linkedin](www.linkedin.com/in/thomas-schwery) or just say by email at [thomas@inf3.ch](mailto:thomas@inf3.ch).
+Find me on
+[Github](https://github.com/tschwery/)
+/ my private [GitLab instance](https://git.inf3.xyz/explore/projects)
+/ [Linkedin](https://www.linkedin.com/in/thomas-schwery)
+or just say Hi by email at [thomas@inf3.ch](mailto:thomas@inf3.ch).

@@ -12,6 +12,9 @@ title = "Thomas Schwery"
 author = "Thomas Schwery"
 copyright = "Thomas Schwery, No rights reserved (CC0)."
 
+pygmentsCodeFences = true
+pygmentsCodeFencesGuessSyntax = true
+
 [params]
 logo = "/images/logo.png"
 subtitle = "A Glog ... Plog ... Blog ..."

@@ -1,6 +1,7 @@
 body {
   font-family: "Roboto", "HelveticaNeue", "Helvetica Neue", Helvetica, Arial, sans-serif;
   background-color: #FCFCFC;
+  text-align: justify;
 }
 
 h1 { font-size: 2.1rem; }

themes/hugo-tschwery/assets/css/skeleton.css (vendored)
@@ -303,23 +303,6 @@ ol ul {
 li {
   margin-bottom: 1rem; }
 
-
-/* Code
-–––––––––––––––––––––––––––––––––––––––––––––––––– */
-code {
-  padding: .2rem .5rem;
-  margin: 0 .2rem;
-  font-size: 90%;
-  white-space: nowrap;
-  background: #F1F1F1;
-  border: 1px solid #E1E1E1;
-  border-radius: 4px; }
-pre > code {
-  display: block;
-  padding: 1rem 1.5rem;
-  white-space: pre; }
-
-
 /* Tables
 –––––––––––––––––––––––––––––––––––––––––––––––––– */
 th,

themes/hugo-tschwery/assets/css/syntax.css (new file)
@@ -0,0 +1,59 @@
/* Background */ .chroma { color: #93a1a1; background-color: #002b36 }
/* Other */ .chroma .x { color: #cb4b16 }
/* LineTableTD */ .chroma .lntd { vertical-align: top; padding: 0; margin: 0; border: 0; }
/* LineTable */ .chroma .lntable { border-spacing: 0; padding: 0; margin: 0; border: 0; width: auto; overflow: auto; display: block; }
/* LineHighlight */ .chroma .hl { display: block; width: 100%;background-color: #ffffcc }
/* LineNumbersTable */ .chroma .lnt { margin-right: 0.4em; padding: 0 0.4em 0 0.4em; }
/* LineNumbers */ .chroma .ln { margin-right: 0.4em; padding: 0 0.4em 0 0.4em; }
/* Keyword */ .chroma .k { color: #719e07 }
/* KeywordConstant */ .chroma .kc { color: #cb4b16 }
/* KeywordDeclaration */ .chroma .kd { color: #268bd2 }
/* KeywordNamespace */ .chroma .kn { color: #719e07 }
/* KeywordPseudo */ .chroma .kp { color: #719e07 }
/* KeywordReserved */ .chroma .kr { color: #268bd2 }
/* KeywordType */ .chroma .kt { color: #dc322f }
/* NameBuiltin */ .chroma .nb { color: #b58900 }
/* NameBuiltinPseudo */ .chroma .bp { color: #268bd2 }
/* NameClass */ .chroma .nc { color: #268bd2 }
/* NameConstant */ .chroma .no { color: #cb4b16 }
/* NameDecorator */ .chroma .nd { color: #268bd2 }
/* NameEntity */ .chroma .ni { color: #cb4b16 }
/* NameException */ .chroma .ne { color: #cb4b16 }
/* NameFunction */ .chroma .nf { color: #268bd2 }
/* NameTag */ .chroma .nt { color: #268bd2 }
/* NameVariable */ .chroma .nv { color: #268bd2 }
/* LiteralString */ .chroma .s { color: #2aa198 }
/* LiteralStringAffix */ .chroma .sa { color: #2aa198 }
/* LiteralStringBacktick */ .chroma .sb { color: #586e75 }
/* LiteralStringChar */ .chroma .sc { color: #2aa198 }
/* LiteralStringDelimiter */ .chroma .dl { color: #2aa198 }
/* LiteralStringDouble */ .chroma .s2 { color: #2aa198 }
/* LiteralStringEscape */ .chroma .se { color: #cb4b16 }
/* LiteralStringInterpol */ .chroma .si { color: #2aa198 }
/* LiteralStringOther */ .chroma .sx { color: #2aa198 }
/* LiteralStringRegex */ .chroma .sr { color: #dc322f }
/* LiteralStringSingle */ .chroma .s1 { color: #2aa198 }
/* LiteralStringSymbol */ .chroma .ss { color: #2aa198 }
/* LiteralNumber */ .chroma .m { color: #2aa198 }
/* LiteralNumberBin */ .chroma .mb { color: #2aa198 }
/* LiteralNumberFloat */ .chroma .mf { color: #2aa198 }
/* LiteralNumberHex */ .chroma .mh { color: #2aa198 }
/* LiteralNumberInteger */ .chroma .mi { color: #2aa198 }
/* LiteralNumberIntegerLong */ .chroma .il { color: #2aa198 }
/* LiteralNumberOct */ .chroma .mo { color: #2aa198 }
/* Operator */ .chroma .o { color: #719e07 }
/* OperatorWord */ .chroma .ow { color: #719e07 }
/* Comment */ .chroma .c { color: #586e75 }
/* CommentHashbang */ .chroma .ch { color: #586e75 }
/* CommentMultiline */ .chroma .cm { color: #586e75 }
/* CommentSingle */ .chroma .c1 { color: #586e75 }
/* CommentSpecial */ .chroma .cs { color: #719e07 }
/* CommentPreproc */ .chroma .cp { color: #719e07 }
/* CommentPreprocFile */ .chroma .cpf { color: #719e07 }
/* GenericDeleted */ .chroma .gd { color: #dc322f }
/* GenericEmph */ .chroma .ge { font-style: italic }
/* GenericError */ .chroma .gr { color: #dc322f; font-weight: bold }
/* GenericHeading */ .chroma .gh { color: #cb4b16 }
/* GenericInserted */ .chroma .gi { color: #719e07 }
/* GenericStrong */ .chroma .gs { font-weight: bold }
/* GenericSubheading */ .chroma .gu { color: #268bd2 }

@@ -1 +0,0 @@
-hljs.initHighlightingOnLoad();

@@ -13,9 +13,5 @@
 
 </div>
 
-<script src="//cdnjs.cloudflare.com/ajax/libs/highlight.js/8.4/highlight.min.js"></script>
-{{ $highlightInitJs := resources.Get "js/init.js" | resources.Minify | resources.Fingerprint }}
-<script src="{{ $highlightInitJs.Permalink }}" integrity="{{ $highlightInitJs.Data.Integrity }}"></script>
-
 </body>
 </html>

@@ -6,14 +6,15 @@
 {{ $skeletonCss := resources.Get "css/skeleton.css" | resources.Minify | resources.Fingerprint }}
 {{ $customCss := resources.Get "css/custom.css" | resources.Minify | resources.Fingerprint }}
 {{ $normalizeCss := resources.Get "css/normalize.css" | resources.Minify | resources.Fingerprint }}
+{{ $syntaxCss := resources.Get "css/syntax.css" | resources.Minify | resources.Fingerprint }}
 
 <meta name="viewport" content="width=device-width, initial-scale=1">
 <link href="//fonts.googleapis.com/css?family=Roboto:400,700" rel="stylesheet" type="text/css">
 
-<link rel="stylesheet" href="//cdnjs.cloudflare.com/ajax/libs/highlight.js/8.4/styles/github.min.css">
 <link rel="stylesheet" href="{{ $normalizeCss.Permalink }}">
 <link rel="stylesheet" href="{{ $skeletonCss.Permalink }}">
 <link rel="stylesheet" href="{{ $customCss.Permalink }}">
+<link rel="stylesheet" href="{{ $syntaxCss.Permalink }}">
 
 <link rel="alternate" href="/index.xml" type="application/rss+xml" title="{{ .Site.Title }}">