Friday, November 13, 2009

[How-to:] Merging Git Repositories

For a few months now, I have been slowly converting my subversion repositories into full-fledged git repositories.  At first, I used git as a front end to various subversion repositories until I became convinced that git was stable and robust enough for my needs.

However, before the conversion I had to answer some philosophical questions about how I wanted my files organized.  Should I have one or more large repositories, or dozens of smaller repositories?   How should I organize the projects?  On a per client basis?  Categorically?  Topically?

I currently have several subversion repositories based on broad topical categories, like "business", "personal", or "archive."  Within each one, files are sorted into progressively more specialized subdirectories.  As a benefit of this structure, I can check out just the directories I need at the moment, make the changes, then delete my local checkout when I am finished.

I could certainly go forward with this approach, and clone the repositories directly into one or more gargantuan git repositories.  But is this the best way?  In my opinion it isn't, so I decided to break apart the large subversion repositories into smaller, topically organized git repositories.

Further, in my experiments with very large repositories, once a git repository grows past 7GB you start running into memory issues, primarily when cloning and packing the repository (if you are cloning an already packed repository, then you are fine).  In the future, as more machines move to 64-bit operating systems and past the 4GB address-space limit of 32-bit systems, this will be less of an issue.

Bring out your dead.

In GTD, you have a physical file labeled "dead" where you put reference files that are no longer needed, but are just too important to throw away.  They are just dead space that awaits its fate.  At some interval, such as every year, you clean out the dead file and either throw things away or re-file them.

Likewise, I created a dead.git repository where I stuff client files (time sheets, invoices, quotes, contracts, etc.) that I don't need to actively reference anymore -- but that are still useful.  Once a year I will merge them into my archive, which I lovingly named tomb.git.

Get it? tomb houses dead bodies... I'm so clever.

Which brings me to merging.

I need to move the entire contents (with history) to another repository.

I have two repositories: 1) tomb.git and 2) source_archive.git.  I want to merge the contents of source_archive.git into tomb.git.

The process is relatively simple.  First, we fetch a copy of the master branch from the source archive repository into a newly created branch named "sourcemerge."  Issuing a 'git branch' shows that the new branch has been created:

    $ cd tomb.git
    $ git fetch ../source_archive.git master:sourcemerge
    From ../source_archive
     * [new branch]      master     -> sourcemerge
    $ git branch
    * master
      sourcemerge

Next, we check out the newly created branch:

    $ git checkout -f sourcemerge
    Checking out files: 100% (20975/20975), done.
    Switched to branch "sourcemerge"

Now, everything should be there from the old repository.  You can check with gitk to see the history has been pulled in.  Next, we have to jump back to the master and merge:

    $ git checkout -f master
    Checking out files: 100% (20975/20975), done.
    Switched to branch "master"
    $ git merge sourcemerge

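See?  It is dead simple.  (Git 2.9 and later refuse to merge histories that share no common commit unless you add --allow-unrelated-histories to the merge command.)  You can now delete the temporary branch with git branch -d sourcemerge.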


Saturday, September 5, 2009

[Perl] Search and Replace Copyright in Source Code

Without getting into a debate over the legalities and application of copyright law, it was necessary to change the company name in the copyright statement in a large swath of code (the company had been purchased, then resold).

For the record, let me say that programmers, IT administrators and managers generally make poor lawyers.  This probably has something to do with the total lack of legal education and training.

This simple problem has a very elegant solution: a one-line Perl program, executed from the command line.

As I put forth the solution, it was dismissed.

Over the next few weeks, the conversation turned.  Time was spent trying to develop an Ant script that would check and modify the code at build time.  Next, building an Eclipse plugin was investigated.  Finally, a developer had found an article on the Internet suggesting you should never, ever change the copyright text.

I never heard anything about it after that.

In any event, here is the solution.  The problem is that spread throughout thousands of files are these textual statements that need to be normalized:

    Copyright © 1983 Umbrella Corporation; All rights reserved.
    Copyright 1923 Umbrella Corporation; All rights reserved.
    Copyright 2050 Umbrella Inc.; All rights reserved.
    Copyright 1234 Umbrella LLC

The easiest way is to use Perl.  With Perl we can search and replace with a regular expression in a file.  For example, let's say I want to replace "hello" with "hello world" in myfile.java.  Easy:

    perl -pi -w -e 's/hello/hello world/g' myfile.java

We can easily search and replace text in one file; we just need to build a regular expression that will match the copyright statements and replace them with the "correct" one.  To cover every file, we use find to locate the files that need the search and replace, and xargs to hand them to the command line.

For example, to change the copyright year ("Copyright nnnn" to "Copyright 2009") you would use:

    find . -name "*.java" -print0 | \
    xargs -0 perl -pi -w -e 's/Copyright \d+/Copyright 2009/g;'

To change the company name:

    find . -name "*.java" -print0 | \
    xargs -0 perl -pi -w -e 's/Umbrella LLC/Umbrella Corporation/g'

Of course, these are just simple examples.  You will have to write a better regular expression and might have to take several stabs at it, but it isn't a three-week issue.


Saturday, February 21, 2009

How to force qmake to stop generating an Xcode project and generate a gcc makefile instead

Problem: Under OS X, qmake automatically generates an Xcode project from a qmake project file.  How do I force it to make a standard gcc makefile?

Unless you configured Qt differently when you built it, the default under Mac OS X is to generate an Xcode project.  The way to generate a "standard" gcc makefile is to pass the "-spec macx-g++" switch on the command line:

    qmake -spec macx-g++

Additionally, you should place the following into your project file in order to stop it from generating the app_bundle:

    mac {
      CONFIG -= app_bundle
    }


Wednesday, December 17, 2008

A "Swiss-Army Knife" Eclipse Setup

As a consultant, I'm often forced to switch development environments based on the customer's preferences, such as Code Warrior, Eclipse, KDevelop, hand-written makefiles, autoconf scripts, and qmake project files.  More often than not the alpha developer has championed a setup and toolset, making it a standard.  Sometimes this setup is documented, but more often than not the documentation quickly falls out of date.

This is how I set up my Eclipse environment to handle just about anything:

Step 1: Setup IDE for Java EE Developers

Download Eclipse IDE for Java EE Developers from eclipse.org.  This 162MB tarball has everything you need to develop JEE and Web applications and includes the IDE, tools for JEE and JSF, Mylyn (task list) and more.

Step 2: Setup C/C++ Development

Next, add GNU C/C++ support to Eclipse.  Select Help->Software Updates.  Select the C/C++ packages, install, and restart Eclipse.

Step 3: Adding Python support to Eclipse

Python has one quirk that drives me nuts -- tabs versus spaces.  If your editor inserts tabs, causing you to inadvertently mix tabs and spaces for indentation, the Python interpreter will often do some very strange things.  I have wasted a lot of time tracking down issues due to mixing tabs and spaces.  The best solution -- use PyDev.

Help -> Software Updates, Available Software.  Add Site: http://pydev.sourceforge.net/updates/.

Install and restart.

Step 4: Adding Perl Integration (EPIC)

Perl is a wonderful scripting language, but wouldn't it be nice to have an editor with syntax highlighting, on-the-fly-syntax checking, a debugger, global and local variable inspection and expression evaluation?  You can with EPIC.

Select Help > Software Updates... in Eclipse, add the update site http://e-p-i-c.sf.net/updates/ and follow the on-screen instructions.

Step 5: Web Tools Platform (WTP)

WTP is a suite of plug-ins with tools for developing J2EE applications, and includes editors for HTML, JavaScript, CSS, JSP, SQL, XML, DTD, XSD, and WSDL.

Select Help > Software Updates... in Eclipse, add the update site http://download.eclipse.org/webtools/updates/ and install.  Note the site might already be in your Eclipse setup, so you can click "Manage Sites..." and check the site, and then back to the available software tab.

Step 6: Make it Harder to Check in Code

Some organizations rely on "automated code review tools" or rules checkers.  I personally feel that if you need to rely heavily on those tools then you don't have the right personnel.

Step 6a: Eclipse Checkstyle.  Help->Software Updates->Find and Install... you know the drill by now: http://eclipse-cs.sourceforge.net/update

Restart Eclipse.

Step 6b: PMD.  PMD scans Java source code and looks for potential problems.  Help->Software Updates->Find and Install... http://pmd.sf.net/eclipse

Step 7: Subclipse Subversion Plug-in

The Eclipse update site URL is: http://subclipse.tigris.org/update_1.4.x

Step 8: JSEclipse JavaScript Editing Plug-in

JSEclipse is a JavaScript editing plug-in from Adobe.  The update site URL is: http://download.macromedia.com/pub/labs/jseclipse/autoinstall/

Step 9: PHP and PHP Debugger

Install PHP.  Click here for the instructions.  Next, install the Zend debugger.

And there you have my current Eclipse setup. Ta-Dah. 


Sunday, August 31, 2008

Microsoft and Google to introduce App Stores

Interesting developments on the mobile front... According to the CNet wireless blog, Microsoft is responding to the resounding success of Apple's App Store and is expected to introduce a similar service called SkyMarket this fall for mobile devices that run its operating systems.  The blog posting is HERE

Some other interesting quotes in the article are:

What remains to be seen is whether Microsoft starts to put an emphasis on developers who develop for Microsoft Pocket PCs and smart phones.


Thursday, March 27, 2008

LD_LIBRARY_PATH in Mac OS X

I keep forgetting, so I thought I would drop a quick blog posting to remind me forever.  For those of you who are moving from Linux to Mac OS, you might be interested:

The equivalent of $LD_LIBRARY_PATH in Linux, is (drum roll please): DYLD_LIBRARY_PATH.

$ export DYLD_LIBRARY_PATH=`pwd`/mydebugpath

And now your application will resolve its shared libraries (.dylib files) at runtime.

For all the gory details you can also read the dyld man page:

$ man dyld


Wednesday, January 16, 2008

git: Trailing whitespace error during commit

I'm using git more and more. However, when checking in code, I sometimes get an error about trailing whitespace and the commit fails.

To fix this:

in .git/hooks/pre-commit, delete the following lines:

if (/\s$/) {
bad_line("trailing whitespace", $_);
}




Mangled by ScribeFire and fixed by hand.


Thursday, September 27, 2007

HTTP GET, POST, and PUT with curl

Curl is one of those Swiss Army knife command line utilities for developers who need to develop web services or network software. I keep forgetting curl's command line syntax, so I thought I would post it here in its artful simplicity.

HTTP GET to a URL:

curl http://www.domain.com/index.html

HTTP PUT a file to a URL:


curl -X PUT -d @mytestfile.xml http://www.domain.com/script.php

HTTP POST a string to a URL:

curl -d "String to post" http://www.domain.com/script.php

HTTP POST a file to a URL:

curl -d @file_to_post.xml http://www.domain.com/script.php

Enjoy.

Mis-posted by ScribeFire and corrected by hand.


Thursday, September 6, 2007

How Not to Implement Serialization in C++

I spent most of the day knocking out a nice PowerPoint slide deck to walk a customer's developers through a large swath of code I just checked into their subversion repository.  I hammered out a framework that would improve productivity and implementation.  With a little luck, the project would be back on schedule.
 
The framework included serialization, compliments of the Boost serialization library. However, the principal architect balked at using Boost serialization -- I should have used the persistence mechanism that was coded by their developers.  I argued, but when the person who signs my timesheets agreed with the architect, the battle was over. The customer is always right.
 
With that, I needed to revamp my code to use the persistence mechanism (which used blocks of memory that were flushed to disk or flash memory), but make it more usable.  Hundreds of lines of, "if (p) p->write(sizeof(x), &x, 1)" are not acceptable to me.  There has to be a better, more developer friendly way. 
 
I immediately turned to Google for help.  Surely a developer out there had a similar interest in rolling their own serialization scheme.  Amazingly, I found several "tutorials" on serialization that had the exact same theme.
 
The number one serialization tutorial was the functionx "C++ Object Serialization" tutorial (http://www.functionx.com/cpp/articles/serialization.htm).  It should be entitled, "How NOT to Implement Object Serialization."
 
Take the following example:

#include <cstring>   // strcpy (this include was missing from the listing)
#include <fstream>
#include <iostream>
using namespace std;

class Student
{
public:
    char   FullName[40];
    char   CompleteAddress[120];
    char   Gender;
    double Age;
    bool   LivesInASingleParentHome;
};

int main()
{
    Student one;

    strcpy(one.FullName, "Ernestine Waller");
    strcpy(one.CompleteAddress, "824 Larson Drv, Silver Spring, MD 20910");
    one.Gender = 'F';
    one.Age = 16.50;
    one.LivesInASingleParentHome = true;

    // Dumps the raw in-memory bytes of the object, padding included.
    ofstream ofs("fifthgrade.ros", ios::binary);
    ofs.write((char *)&one, sizeof(one));

    return 0;
}

So what is wrong with doing this?  It is an extremely poor solution that only works reliably when the file is read back by code built with the same compiler for the same platform.  What lands on disk is the raw in-memory image of the object, so member sizes, padding, and byte order all leak into the file.
 
Also, a pointer to the object one doesn't necessarily point to the first data item, public or private, within the class.  I spent several days tracking down and squashing a bug in an embedded system because of assumptions like that.  In my case, the original developer cast the class to a char *, then wrote the class to a socket.  The developer assumed that the pointer to the class would point to the private data, and did not consider that this might be compiler dependent.  After looking at network traces I quickly figured out the problem.  One compiler was injecting some extra bytes, causing problems.
 
Next, what happens when you save a class that contains pointers to other classes?  Kaboom.
 
 


Free Dr. Dobb's Journal and MSDN Magazines

Developers can register to subscribe to digital editions of Microsoft MSDN Magazine and Dr. Dobb's Journal for free. Simply log in HERE using your Windows Live ID (formerly called Passport) account and subscribe.
  • A sample of Dr. Dobb's Journal can be found HERE.
  • A sample of MSDN Magazine can be found HERE.





Wednesday, September 5, 2007

Dealing with C++ "Unused Parameter" Warnings

Generally, when I am working with clients I try to get everyone on the team to agree to a "no compiler warnings" policy and immediately turn on the "warnings as errors" switch. Thereafter, any compiler warning is automatically treated like an error and the build will fail. Unfortunately, when stepping into a project where most of the code was written long ago, there may be hundreds if not thousands of compiler warnings. The only solution is to investigate and fix (or suppress) each one.

So how do you suppress the unused parameter warnings when you legitimately don't use the argument? There are three ways:

1. #pragma unused. #pragmas are compiler specific and should be avoided. However, if your compiler supports it, you can use it as follows:

void my_function(int32 foo)
{
#pragma unused(foo)
}

2. Comment out the argument name. With no name left in the parameter list, the compiler has nothing to flag as unused:

void myfunction(int /* arg */)
{
}

3. Cast to void. Casting an unused variable to void will always stop the warning.

#define UNUSED_ARGUMENT(x) (void)(x)
void myfunction(int arg)
{
    UNUSED_ARGUMENT(arg);
}
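
If you can build as C++17 or later, the standard [[maybe_unused]] attribute does the same job without a macro or a pragma.  The sketch below shows it next to a small template helper that wraps the cast-to-void idea in a function.  The names on_event, on_timer, and ignore_unused are hypothetical, made up for this example.

```cpp
// Hypothetical callback: the interface dictates both parameters,
// but this implementation only needs the payload.
int on_event([[maybe_unused]] int event_id, int payload)
{
    return payload * 2;
}

// Function form of the cast-to-void trick: accepts any number of
// arguments and does nothing with them.
template <typename... Ts>
void ignore_unused(const Ts&...)
{
}

// Hypothetical callback that silences the warning in the body instead.
int on_timer(int timer_id, int ticks)
{
    ignore_unused(timer_id);
    return ticks + 1;
}
```

The attribute keeps the parameter name visible in the signature, which is handy when the name documents what the caller is expected to pass.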
