This document contains short and informal guidelines and notes on various research tools and coding skills based on my personal experience, prepared mainly for new graduate students. These are neither complete tutorials nor carefully prepared manuals, so read and use them at your own risk. I do not endorse any product, website, tool or company. Happy research coding!
Last update: April 2019
Use an online git server for convenience:
A very short tutorial on git (please look at online tutorials/stackoverflow to understand the details and find answers to your questions):
git add "file"
    Add a file to your repository.
git add .
    Add everything under the current folder recursively to your repository.
git commit
    Take a local backup of the changes that you have made (collaborators cannot access these changes yet). Feel free to commit as frequently as you want; these are just backups.
git push
    Upload all your recent commits to the server. When collaborators do git pull, they will receive these updates.
git pull
    Receive updates from your collaborators.
Use alias gitsync='(git add . && git-commit-silent && git pull && git push)' to quickly do the three steps above, where alias git-commit-silent='git diff-index --quiet HEAD || git commit'.
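The same add/commit/pull/push sequence can also be sketched in Python, which may be handy on machines where the shell aliases are not set up. This is only a minimal sketch: the function name git_sync and its run parameter are illustrative assumptions, not part of the alias itself.

```python
import subprocess


def git_sync(run=subprocess.call):
    """Mimic the gitsync alias: add, commit if needed, pull, push.

    `run` takes a command list and returns its exit code. It is a
    parameter only so the sequence can be tried without a real repository.
    Returns True if every step succeeded.
    """
    if run(["git", "add", "."]) != 0:
        return False
    # `git diff-index --quiet HEAD` exits with 0 when there are no changes
    # relative to HEAD, so commit only on a non-zero exit code.
    if run(["git", "diff-index", "--quiet", "HEAD"]) != 0:
        if run(["git", "commit"]) != 0:
            return False
    # Pull first so that collaborators' updates are merged before pushing.
    return run(["git", "pull"]) == 0 and run(["git", "push"]) == 0
```

Calling git_sync() from inside a repository runs the real git commands; as with the alias, git commit opens your editor for the commit message.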
Use .gitignore to avoid committing large data files to your repository. What you want to commit depends on your project. For example, you definitely do not want to commit dataset images and model files that can be downloaded from the web or regenerated using your code. Instead, you can provide scripts to download training images, pretrained models, etc. from a particular server. (In my own code, I keep such files in a remote folder and make the root data folder a parameter.) But if it is the LaTeX source for a paper, you should commit the images, PDFs, etc. that are needed to compile it.
git checkout "hash" will give you that version in a detached state, and git switch - will revert back to the up-to-date state.

Partial-color-blindness friendly color palette
The explicit color names/values (adapted from source)
Palette = [
('006BA4','Cerulean/Blue'),
('FF800E','Pumpkin/Orange'),
('ABABAB','Dark Gray/Gray'),
('595959','Mortar/Grey'),
('5F9ED1','Picton Blue/Blue'),
('C85200','Tenne (Tawny)/Orange'),
('898989','Suva Grey/Grey'),
('A2C8EC','Sail/Blue'),
('FFBC79','Macaroni And Cheese/Orange'),
('CFCFCF','Very Light Grey/Grey')
]
import matplotlib.pyplot as plt
import numpy as np

# Plot one vertically shifted cosine curve per palette color.
th = np.linspace(0, 2*np.pi, 128)
fig, ax = plt.subplots(figsize=(3, 3))
for j in range(len(Palette)):
    ax.plot(th, np.cos(th)-j, color='#'+Palette[j][0], label=f'C{j}')
ax.legend()
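The palette can also be reused outside this single plot. The sketch below converts the hex codes to RGB tuples and, optionally, installs them as matplotlib's default color cycle. The helper names hex_to_rgb and use_as_default_cycle are assumptions made for this example; the hex codes are copied from the list above.

```python
HEX_COLORS = ['006BA4', 'FF800E', 'ABABAB', '595959', '5F9ED1',
              'C85200', '898989', 'A2C8EC', 'FFBC79', 'CFCFCF']


def hex_to_rgb(code):
    """Convert a hex string such as '006BA4' to an (r, g, b) tuple in [0, 1]."""
    return tuple(int(code[i:i + 2], 16) / 255 for i in (0, 2, 4))


def use_as_default_cycle(hex_colors=HEX_COLORS):
    """Make the palette the default line-color cycle for new matplotlib axes."""
    # Imported here so that hex_to_rgb stays usable without matplotlib.
    import matplotlib.pyplot as plt
    from cycler import cycler  # installed as a matplotlib dependency
    plt.rcParams['axes.prop_cycle'] = cycler(color=['#' + c for c in hex_colors])
```

After calling use_as_default_cycle(), plots created in the same session pick these colors without passing color= explicitly.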
We use Slack, Dropbox Paper, Zoom and Google Docs (and sometimes other similar tools) heavily for coordinating research and taking logs in our group. Some suggestions for Slack:
When using online collaborative TeX editors, such as overleaf.com: git-clone the repository on your local machine, cache your git password, and take frequent backups via watch -n 600 git pull, in case the website goes down.
There are several very good libraries for this purpose. We prefer PyTorch, but TensorFlow (TF) is also fine.
In some specific cases, a good starting point might be a high-quality public code base relevant to your project.
You should progressively become comfortable with using the linux development environment in the terminal.
Learn vim or emacs very well. The time you spend learning these editors pays off over time.
You do not need to be a Makefile or Bazel expert, but you do want to understand what they do, and why people use them for compilation-heavy projects.
Use the ./configure --prefix option to target compilation into an arbitrary folder. Most libraries (including the gcc compiler) can be installed locally.
Use gcc -Wl,-rpath -Wl,"libpath" to hard-code the linking path.
Use g++ -Wl,-rpath -Wl,"libpath" to hard-code the linking path.
Use $LD_LIBRARY_PATH to prioritize your library directories at run time.
What OS you use on your laptop is not that important. Linux distros already come with the core utilities that you need. Mac/OS X is unix-based, so it already has native terminal support, and you can install an X11 server if you need one. You can also install various linux utilities via Homebrew. On Windows, use Cygwin or the Windows Subsystem for Linux to have a linux-like environment.
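As a small illustration of the $LD_LIBRARY_PATH point above, the sketch below builds an environment in which a locally installed library directory is searched first; such an environment can be passed to a child process. The function name and the example paths are hypothetical.

```python
import os


def with_library_dir_first(libdir, env=None):
    """Return a copy of `env` with `libdir` at the front of LD_LIBRARY_PATH."""
    new_env = dict(os.environ if env is None else env)
    current = new_env.get("LD_LIBRARY_PATH", "")
    # Directories listed earlier in LD_LIBRARY_PATH are searched first.
    new_env["LD_LIBRARY_PATH"] = libdir + (os.pathsep + current if current else "")
    return new_env
```

For example, subprocess.run(["./my_program"], env=with_library_dir_first("/home/me/local/lib")) would start a (hypothetical) program with the local directory taking priority, without changing the environment of the current shell.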
A brief guideline for managing multiple versions of the code, while running experiments at the same time:
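Keep the code for the new idea in a separate folder, e.g. src_newidea.
Create a branch for it via git switch --create new_awesome_idea (run inside the src_newidea folder).
You can later return to this branch via git switch new_awesome_idea.
When you are done with src_newidea, merge the branch back into master and delete it via
git checkout master
git merge new_awesome_idea
and, git branch -d new_awesome_idea
List all branches via git branch -a.
You can keep track of the experimental results by tagging (naming) the log files and your own notes using the date and the git hash of the code that executes for each one of your experiments.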
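Tagging logs with the date and git hash can be automated. The following is a minimal sketch under assumptions: the function names, the tag format and the "nogit" fallback are illustrative choices, not a prescribed convention.

```python
import subprocess
from datetime import datetime


def current_git_hash():
    """Return the short hash of HEAD, or 'nogit' if it cannot be determined."""
    try:
        out = subprocess.check_output(
            ["git", "rev-parse", "--short", "HEAD"], stderr=subprocess.DEVNULL)
        return out.decode().strip()
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, or the current folder is not a repository.
        return "nogit"


def make_run_tag(git_hash, when=None):
    """Build a tag made of date, time and git hash, for naming logs and notes."""
    when = datetime.now() if when is None else when
    return when.strftime("%Y-%m-%d_%H%M%S") + "_" + git_hash
```

A log file could then be named, for instance, "logs/" + make_run_tag(current_git_hash()) + ".txt". Note that the hash identifies the code only if all changes were committed before the experiment started.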
It is important to understand how to use a remote server via ssh in a fluid way. There are several tools and tricks that can make your life easier:
Use screen to create sessions that do not terminate when you disconnect. In this manner, for instance, you can keep your code running for several days.
alias newscreen='/usr/bin/screen -S'  # create a new screen session via newscreen NewScreenName
alias screen='/usr/bin/screen -dr'  # attach to an existing screen session via screen ExistingScreenName
An example .screenrc is available here.
git is a must for backups and collaboration, but its commit/push/pull cycle is not appropriate for this purpose. mosh can also be handy.
public_html is sufficient to inspect your code and its graphical outputs.
$DISPLAY, using which you can redirect the output of a python/lua/matlab interpreter.

\IfFileExists{darkmode.tex}{
  \usepackage{pagecolor}
  \definecolor{myfg}{gray}{0.94}
  \definecolor{mybg}{gray}{0}
  \input{darkmode.tex} % myfg and mybg can be altered in darkmode.tex using definecolor.
  \pagecolor{mybg}
  \color{myfg}
}{}

Add darkmode.tex to .gitignore, and create darkmode.tex to locally enable dark-mode rendering without affecting the human collaborators.
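The two darkmode.tex steps can be scripted. This is a rough sketch, assuming the paper lives in a single project folder; the function name and the color value written into darkmode.tex are illustrative only (an empty darkmode.tex would already switch the dark colors on).

```python
from pathlib import Path


def enable_darkmode(project_dir):
    """Create darkmode.tex and list it in .gitignore inside `project_dir`."""
    project = Path(project_dir)
    darkmode = project / "darkmode.tex"
    if not darkmode.exists():
        # Illustrative content: override the background with a dark gray.
        darkmode.write_text("\\definecolor{mybg}{gray}{0.12}\n")
    gitignore = project / ".gitignore"
    lines = gitignore.read_text().splitlines() if gitignore.exists() else []
    # Add the entry only once, so the function is safe to call repeatedly.
    if "darkmode.tex" not in lines:
        lines.append("darkmode.tex")
        gitignore.write_text("\n".join(lines) + "\n")
```

Deleting darkmode.tex restores the regular rendering, since the preamble only switches colors when the file exists.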