Thursday, November 22, 2007

latex note: minipage with borders

One way to add borders to a minipage is to embed it in an \fbox: place the whole minipage environment inside the argument of \fbox{...}.


Friday, November 09, 2007

C++ note: initializers in multiple inheritance

Here is a tutorial on C++ multiple inheritance.

A C++ class can inherit from multiple base classes. If these base classes have base classes of their own, each of them is initialized separately by default: for each parent class, the subclass contains a copy of the inherited structure. For example, suppose B and C derive from A, and D derives from B and C. D will then have two copies of A, one inherited from B and the other from C.

On some occasions only one copy of A is needed in D. This type of inheritance is enabled by letting B and C derive virtually from A. With virtual inheritance, B and C no longer embed the structure of A at a fixed place in their own structure; instead they locate it indirectly, typically through a pointer or offset stored in their vtables. When D derives from B and C, the compiler generates only one A structure, which is shared by B and C.

Here comes a question: in the constructor of D, what structures do we need to initialize?

class A {
public:
   int a;
   A ( int i ) : a (i) {}
};

class B : virtual public A {
public:
   B ( int i ) : A(i) {}
};

class C : virtual public A {
public:
   C ( int i ) : A(i) {}
};

class D : public B, public C {
public:
   D ( int i ) : A(i), B(i), C(i) {} // <-------------
};

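In the above example, D has to initialize A, B and C in its constructor. The initialization of A is mandatory even though both B and C have initializers for A in their own constructors. When a D object is constructed, the virtual base A is initialized by D, the most derived class, and the A(i) initializers in B and C are ignored. This can be observed by removing A(i) from D's initializer list: the compiler then looks for a default constructor of A, and because A has none here, the code no longer compiles. If A is given a default constructor, the a field of a D object holds whatever that constructor sets, not the value passed to B or C.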
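The rule can be checked with a small variation of the classes above. This is only a sketch: the extra initializedBy field and the +100 / +200 offsets are made up for illustration, so that each constructor leaves a recognizable trace in the shared A.

```cpp
#include <string>

// Records which constructor call actually initialized the shared base.
struct A {
    int a;
    std::string initializedBy;   // hypothetical field, for illustration only
    A(int i, const std::string& who) : a(i), initializedBy(who) {}
};

struct B : virtual public A {
    B(int i) : A(i + 100, "B") {}   // ignored when B is not the most derived class
};

struct C : virtual public A {
    C(int i) : A(i + 200, "C") {}   // ignored when C is not the most derived class
};

struct D : public B, public C {
    D(int i) : A(i, "D"), B(i), C(i) {}   // only this A(...) call runs for a D object
};
```

For a D object built with D d(7), d.a is 7 and d.initializedBy is "D". A standalone B object still uses the initializer written in B, because B is then the most derived class.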

Thursday, November 08, 2007

C++ note: avoid allocating large structures in the stack

An unexpected segmentation fault when calling a function can be caused by allocating a large object on the stack. The following piece of code is likely to fail.

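#include <iostream>
using namespace std;

class Big {
   int data[10000000];   // about 40 MB, far beyond a typical stack limit
};

void test() {
   cout << "entering test";
   Big big;              // placed on the stack when test() is entered
}

int main(int argc, char** argv) {
   test();
}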

The reason is this: when test() is called, the variables declared in the function are placed on the stack (a special area of memory for function calls, commonly supported by a stack pointer in the CPU architecture). The size of the stack is limited, and putting too much data on it can cause a crash without any warning. (Note that compilers may reserve the stack space for a function before executing any of its statements, and the output is buffered, so the "entering test" hint might not be printed.) This kind of error is hard to trace during debugging. The stack limit depends on the operating system and its settings, and may be larger on newer systems, but this trap is still commonly seen.

To avoid such segmentation faults, we need to put big structures on the heap. Two methods can be used. One is allocating the memory inside the class:

class Big {
   int *data;
public:
   Big() { data = new int[10000000]; }
   virtual ~Big() { delete[] data; }   // memory from new[] must be released with delete[]
};

The other is allocating memory inside the function instead:

void test() {
   cout << "entering test";
   Big *big = new Big;   // the object itself now lives on the heap
   delete big;
}
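A third variant, not part of the original note, is to let a std::vector own the buffer. The vector keeps its elements on the heap and releases them automatically, so no new or delete has to be written by hand. The sketch below assumes a hypothetical size() accessor and a useBig() function, added only so that the result can be observed.

```cpp
#include <cstddef>
#include <vector>

class Big {
public:
    // The vector allocates n ints on the heap; only a small handle
    // stays inside the Big object itself.
    explicit Big(std::size_t n) : data(n) {}
    std::size_t size() const { return data.size(); }
private:
    std::vector<int> data;
};

// Big can now be a plain local variable without exhausting the stack.
std::size_t useBig() {
    Big big(10000000);
    return big.size();
}
```

With this layout the local variable big takes only a few bytes of stack space, whatever the size of the buffer.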

Wednesday, November 07, 2007

C++ note: pointer and reference

The * operator dereferences a pointer: *p denotes the object that p points to. When *p is passed to a parameter of reference type, the reference binds to that object itself rather than to a copy. This is illustrated with the following example.

void reference( int &i ) { i = 4; }

void testReference() {
   int i=5;
   int *p = &i;
   reference( *p );      // the parameter refers to the local i
   cout << i << endl;    // prints 4
}

As can be seen from the output (4), the i in the testReference function is actually modified. This shows that *p stands for the object itself, not a copy of it, when it is bound to a reference.
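For contrast, the sketch below passes the same *p once to a reference parameter and once to a value parameter. The function names are made up for this illustration.

```cpp
// Takes its argument by reference, so it writes to the caller's object.
void setByReference(int &i) { i = 4; }

// Takes its argument by value, so it only changes a local copy.
void setByValue(int i) { i = 4; }

// Returns the value of the local variable after one call through *p.
int demoDereference(bool useReference) {
    int i = 5;
    int *p = &i;
    if (useReference)
        setByReference(*p);   // *p names i itself; the reference binds to i
    else
        setByValue(*p);       // *p is read and its value 5 is copied
    return i;
}
```

demoDereference(true) returns 4 and demoDereference(false) returns 5: the expression *p is the same in both calls, and only the parameter type decides whether the original object is modified.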

Friday, November 02, 2007

C++ note: predefined macros

Standard C++ compilers all support the following predefined macros:

__DATE__ - a string literal containing the date when the source file was compiled, in the form of "Mmm dd yyyy".

__FILE__ - a string literal containing the name of the source file.

__LINE__ - an integer representing the current source line number.

__STDC__ - defined when a C compiler conforms to the ANSI standard. This macro is undefined if the language level is set to anything other than ANSI. Whether it is defined when compiling C++ is up to the compiler.

__TIME__ - a string literal containing the time when the source file was compiled, in the form of "hh:mm:ss".

__cplusplus - defined when the source file is compiled as C++; its value identifies the version of the C++ standard.

These macros can be used anywhere in the source code, just as if they were defined by the #define directive. Their values change according to the specific file and line in the source code. Therefore, we can use them to implement error reporting that includes the file name and line number.

#define REPORT(x) do { cerr << endl << "In " << __FILE__ << ", line " << __LINE__ << ": " << endl << x << endl; cerr.flush(); } while (0)
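A variation on the same idea, given here only as a sketch: the formatting is moved into an ordinary function, and the macro does nothing except capture __FILE__ and __LINE__ at the call site. The names formatReport and REPORT_STRING are made up for this example.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: formats a location and a message into one line.
std::string formatReport(const char* file, int line, const std::string& message) {
    std::ostringstream out;
    out << "In " << file << ", line " << line << ": " << message;
    return out.str();
}

// A macro is still needed here, because __FILE__ and __LINE__ must expand
// where the report is written, not inside formatReport.
#define REPORT_STRING(message) formatReport(__FILE__, __LINE__, (message))
```

Because REPORT_STRING expands to a single expression, it can be used inside an if/else without braces, and the resulting string can be sent to cerr or to a log file.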

Monday, October 29, 2007

C++ note: operator precedence

Another common trap is the precedence between == and & | ^

The expression a & b == 0 is interpreted as a & (b==0) instead of (a & b) == 0, quite different from a + b == 0.

Conclusion: always add parentheses around expressions that use the bitwise operators & | ^ when they appear next to a comparison.
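The two readings can be put side by side. In the sketch below the grouping chosen by the compiler is written out with explicit parentheses; the function names are made up for illustration.

```cpp
// How the compiler groups "a & b == 0": the comparison binds first,
// and its result (0 or 1) is then combined with a.
bool asParsed(int a, int b) { return (a & (b == 0)) != 0; }

// What is usually intended: test whether a and b share no set bits.
bool asIntended(int a, int b) { return (a & b) == 0; }
```

With a = 2 and b = 1 the two functions disagree: asParsed gives false, because b == 0 is false and 2 & 0 is 0, while asIntended gives true, because 2 & 1 is 0.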

Monday, October 01, 2007

Compile log: SRILM

SRILM is a language modelling package used by the Moses translation system. The compilation is quite straightforward, except for two things:

1. The current path in the Makefile needs to be changed.
2. In common/, there are paths to gcc and g++ that may need to be changed.
In the first line, GCC_FLAGS might need editing as well, for example -mtune=pentium3.

Sunday, September 16, 2007

vim note: grep

The handy Linux tool grep can also be used in Vim, by typing ":grep PATTERN FILES" in command mode. It searches all the designated files for the input pattern. Note that FILES needs to be given as an absolute path. Wildcards, such as "c:\\files\\*.txt", can be used. In version 7.0, "\\" is required as the path separator under Windows; "/" does not seem to work. Regular expressions can be used for PATTERN.

By default, only the first search result is shown. Use ":cn" to navigate to the next search result, and use ":cp" to navigate to the previous result. The navigation can jump from one file to another, of course. When multiple buffers (or files) are opened, use ":bn" and ":bp" to jump from one file to the other. Use ":bd" to remove a file from buffers.

Saturday, September 15, 2007

mod_python note: sessions

mod_python has its own modules for sessions and cookies. The mod_python documentation contains concise explanations of their usage. Cookies are used for session maintenance. An introduction can be found here.
Python also supports cookies through a built-in module, which was probably developed for CGI programming before being adopted into the Python core. mod_python can use this module for cookies as well.

Python note: using the windows clipboard

The windows clipboard allows HTML content to be processed. A specification about the specific format can be found here.

Python support for the Windows clipboard is found in this site package.

A recipe for using the clipboard, written by Phillip Piper, can be found on this web page. I gave a simplified version of putting something into the clipboard here.

Sunday, September 09, 2007

Apache note: mod_python global objects

The term "global objects" in this article means objects that exist from the start of the web server until its shutdown. An example of such a global object is a database proxy, which is initialized at server start and handles database calls for the whole serving session.

At first glance, there appears to be no place for defining global objects, because mod_python runs on a per-request basis, with each request being mapped to a call.

However, mod_python has the advantage over CGI that one Python interpreter is created per virtual server to handle all requests to that server. The interpreter starts when the server starts, and lasts until the server is shut down. Here is a reference for the multiple interpreter mechanism of the mod_python module.

To take advantage of this, global variables can be placed in the global namespace of the Python interpreter. For example, they could be initialized in a module's namespace.

Apache note: set up a mod_python server

Though the building of mod_python can be a little daunting under UNIX, it's quite easy to start a mod_python server with apache under the windows platform.

Suppose that a machine has python installed.

First download the apache http server from the apache web site. (you need to find the download site by following the links from the main site) Current version is 2.2.

Second download the mod_python binaries for windows from the mod_python web site. Current version is 3.3

Third install apache following the instructions.

Fourth install mod_python following the instructions. The information in the last page is important. The LoadModule command must be added to the httpd.conf file for apache so that the http server recognizes mod_python.

The installation test can be found in the documentation. Note that any request ending in .py will work. This is because the request URL is handed to the handler as a parameter.

Latex note: how to shrink tables

To reduce the size of a table, two general methods can be used.

1. Shrink the font size.
For example, use {\small content} instead of content in the table cell. (\small is a declaration, so the braces go around it.)

2. Squeeze space between columns.
For example, the padding on each side of a column can be reduced by setting \tabcolsep to a smaller length before the table.

Saturday, August 25, 2007

C++ note: unsigned int

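Suppose a is an unsigned int. The expression a>-1 is then always false. This is because the two operands are converted to a common type before the comparison: -1 is converted to unsigned int and becomes the maximum unsigned value (UINT_MAX).

Hint: keep track of the unsigned int variables in the code and do not compare them with a negative number. Such a comparison is unnecessary anyway, because an unsigned value is never negative.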
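The conversion can be made visible by writing it out explicitly. This is a small sketch with made-up function names; the cast spells out what the compiler does implicitly in a>-1.

```cpp
#include <climits>

// The value that -1 turns into when it is converted to unsigned int.
unsigned int minusOneAsUnsigned() {
    return static_cast<unsigned int>(-1);
}

// What "a > -1" means once -1 has been converted: a > UINT_MAX,
// which no unsigned int value can satisfy.
bool greaterThanMinusOne(unsigned int a) {
    return a > static_cast<unsigned int>(-1);
}
```

minusOneAsUnsigned() equals UINT_MAX, and greaterThanMinusOne returns false for every argument, including 0 and UINT_MAX itself.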

Wednesday, August 01, 2007

Linux note: how to disable output

One way is adding a redirection to the command to execute:

command >/dev/null

In the above line, command is executed with its output redirected to /dev/null.

Some more notes about the above syntax. ">" is the operator that directs the output of a command to a file; in this case, the file is the special device /dev/null. The standard output (using ">") and the error output (using "2>") can both be redirected. To redirect the standard output and the error output at the same time, use "&>" (a bash shorthand for ">file 2>&1"), and to append the redirected output to a file instead of overwriting it, use ">>".

Disabling output is necessary in many situations. For example, when using nohup, the system output is put into the file nohup.out. If the disk space is limited, we may want this file to be as small as possible. Therefore, we may put necessary information into the error output, and redirect the standard output to null.

Saturday, June 16, 2007

VIM note: viewport

It's sometimes necessary to split the view of a file into many windows. Below is a cheatsheet copied from here.

:sp will split the Vim window horizontally. Can be written out entirely as :split .
:vsp will split the Vim window vertically. Can be written out as :vsplit .
Ctrl-w Ctrl-w moves between Vim viewports.
Ctrl-w j moves one viewport down.
Ctrl-w k moves one viewport up.
Ctrl-w h moves one viewport to the left.
Ctrl-w l moves one viewport to the right.
Ctrl-w = tells Vim to resize viewports to be of equal size.
Ctrl-w - reduce active viewport by one line.
Ctrl-w + increase active viewport by one line.
Ctrl-w q will close the active window.
Ctrl-w r will rotate windows to the right.
Ctrl-w R will rotate windows to the left.

Tuesday, June 12, 2007

C++ note: mix with C code

The easiest way to make use of C modules is to include their header files inside an extern "C" block:

extern "C" {
#include <a_c_module.h>
}

Then the C++ compiler knows that the header declares its functions and structures the C way, and will compile references to them from the C++ module in a compatible way.

For the standard library headers named "cblabla", extern "C" is already taken care of in the header files. This is why the standard C headers can be included directly.

For the linker, different vendors may use different schemes to mangle C++ names, whereas C names are not mangled in this way. Therefore, it needs to be confirmed that the C modules are compiled in a way that is compatible with the C++ module.

Making use of C++ modules from C modules is harder than the other way around. For the C++ programmers, it might be possible to turn some necessary C files into the C++ format and then compile the whole project as C++. Many files can be kept as the C modules and linked explicitly.
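For the direction from C++ to C, a common approach is to give the C++ functions that C needs to call C linkage. The sketch below shows both directions in one file; add_numbers and lengthOf are made-up names, and the C caller itself is not shown.

```cpp
#include <cstring>   // standard C header in its C++ form; no extern "C" needed here

// Hypothetical function with C linkage: its symbol name is not mangled,
// so a C module could declare it as "int add_numbers(int, int);" and call it.
extern "C" int add_numbers(int a, int b) {
    return a + b;
}

// Ordinary C++ function that calls a C library routine directly.
std::size_t lengthOf(const char* text) {
    return std::strlen(text);
}
```

Only the interface of add_numbers has to be expressible in C. Its body may still use any C++ feature, as long as no exception escapes into the C caller.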

Monday, June 04, 2007

Javascript note: about the regular expressions

Javascript supports regular expressions as a built in language feature. Regular expressions are enclosed between two "/".

Here is a tutorial.

Tuesday, May 29, 2007

Python note: python eggs

"Eggs to Python are like Jars to Java..."

More information

Thursday, May 24, 2007

C++ note: static variables

C++ is different from Java and C# in the way classes are defined. The header files in C++ are not compiled into objects by themselves - they only indicate how the data structures are organized, i.e. a reference to a member is the base address plus an offset. Variables and functions that are to appear in the object files must be defined in the cpp files. Methods defined inside the class body in a header file are treated as inline methods.

Static variables are those with static storage duration; they live in the program's static data area, not on the stack or the heap. They exist from the start of the program until it finishes. All variables defined at file scope in a cpp file are static variables in this sense, and they have external linkage by default. If a variable defined at file scope has an explicit "static" modifier, it has internal linkage and can only be accessed from within that module. Static variables can also be defined inside functions and methods, in which case they still persist for the rest of the program run once they have been initialized; however, they can only be accessed from the scope in which they are defined. Accessibility is the main difference between these kinds of static variables. Lastly, constants declared const in header files have internal linkage by default, like static variables. This is why values can be assigned to them in the header files.
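The difference in accessibility can be seen in a few lines. This is a sketch with made-up names: one static variable at file scope and one inside a function.

```cpp
// Internal linkage: this variable is visible only inside this source file.
static int callsSoFar = 0;

// The local static is initialized once and keeps its value between calls,
// but it can only be named inside nextId().
int nextId() {
    static int lastId = 0;
    ++callsSoFar;
    return ++lastId;
}

// Accessor so that other code can observe the file-level counter.
int callCount() { return callsSoFar; }
```

The first call to nextId() returns 1 and the second returns 2, which shows that lastId survives between calls even though it is declared inside the function.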

Tuesday, May 22, 2007

g++ note: don't forget -c option

I made a mistake when writing a Makefile today: I forgot to put -c in the compile-only step. Without -c, g++ tries to link each source file into an executable, which led to a lot of unresolved external references.

Monday, May 14, 2007

Latex note: generating letter size papers

latex paper
dvips -t letterSize -Ppdf -G0 paper
ps2pdf -sPAPERSIZE=letter

If the last step doesn't work, try running ps2pdf on the ps file without the -sPAPERSIZE option. It should work because the ps file already contains the size information.

Thursday, April 12, 2007

Latex note: more on making EPS graphs from MS Office

Please see my previous article about how to generate an eps chart from MS Excel.

You can also insert Word pictures into eps. The mechanism is the same as Excel. First, make an empty word document or page to draw the vector graphs. Then print the page, using general eps printer. Lastly, modify the page bounding box of the eps file so that it fits the graph.

There is one thing to notice though: the coordinates run from bottom left to top right! If the four numbers are given as top left and bottom right instead, the picture will still show in the latex document, but it will be squeezed together with the text because its height is below zero!

Latex note: use minipage to align a figure of text to center, while keeping text aligned to left

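Sometimes we need to place a block of text with multiple lines at the center of the page.

\begin{center}
this is first line \\
\hspace*{0.5cm} this is second line \\
this is third line \\
this is fourth line
\end{center}

The above may not work because every line is centered individually, so the lines look messy on the left border. The solution is to put the text in a minipage inside the figure: the minipage as a whole is centered, while the text inside it stays aligned to the left.

\begin{center}
\begin{minipage}{0.6\textwidth}   % the width is an example value
this is first line \\
\hspace*{0.5cm} this is second line \\
this is third line \\
this is fourth line
\end{minipage}
\end{center}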

Wednesday, April 11, 2007

Latex note: protect citations in caption

\caption{xxx \cite{yyy}} sometimes fails to compile with latex, because \cite can be fragile inside a moving argument such as the caption.

Use \caption{xxx {\protect \cite{yyy}}} instead.

Saturday, March 31, 2007

Python note: os.path.join unexpected result for paths beginning with "/" or "\"

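Pay attention when joining paths that begin with "/" (or "\" on Windows): a component that starts with a separator is treated as a new root, and the components before it are discarded. (Raw strings are used below so that backslash sequences such as "\a" are not read as escapes.)

os.path.join(r"c:\abc", "def", "ghi") -> no problem, c:\abc\def\ghi

os.path.join(r"c:\abc", r"\def") -> outputs "\def"; "abc" is dropped (newer Python versions keep the drive and return "c:\def").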

Friday, March 23, 2007

C++ note: be careful with unsigned

I just made a mistake with an unsigned integer. It took me nearly an hour to debug, just to find that the subtle bug is caused by this:

if (a<b-2) ...

a = 0 b = 1

b is an unsigned integer.

The reason is that b-2 is evaluated as an unsigned integer: 1-2 wraps around to a very large value instead of -1, and therefore the predicate gives the wrong result. The conclusion is that we should avoid subtraction on unsigned values in comparisons.

if (a+2<b) is the solution for this problem
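Both forms can be compared directly. In the sketch below both a and b are declared unsigned int, which is an assumption (the note only says that b is unsigned), and the function names are made up.

```cpp
// The buggy form: b - 2 is computed in unsigned arithmetic and wraps around
// to a huge value when b is smaller than 2.
bool buggyLess(unsigned int a, unsigned int b) { return a < b - 2; }

// The rewritten form from the note: no subtraction, so nothing wraps
// for small values of a and b.
bool fixedLess(unsigned int a, unsigned int b) { return a + 2 < b; }
```

With a = 0 and b = 1, buggyLess returns true while fixedLess returns false, which is the mathematically expected answer. Note that a + 2 can itself wrap around when a is within 2 of the maximum unsigned value, so the rewritten form is only safe when a is known to be small.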

Friday, January 12, 2007

Latex note: convert Excel charts to eps files

This tip applies to any chart from MS Office.

In short, in order to insert charts from Excel to your Latex document, you need to (1) print your chart from MS Office into an eps file; (2) modify the bounding box in the output.

You need to install the Generic PostScript Printer driver from Adobe first. Link the printer to the port FILE: . After installation, set its output option from Property -> Printing preferences... -> Advanced -> Postscript options -> Postscript output option to encapsulated postscript format (eps).

This printer can be used to print the chart to an eps file.

The EPS file contains a bounding box which defines where the chart is. Open the EPS file in a PostScript viewer, and open it in a text editor at the same time. Tick options -> eps clip in the viewer. Then edit the line %%BoundingBox: ... in the text editor. Adjust the four numbers so that the clip fits the chart well. Then save the EPS file.

Now you are done. Insert the chart into latex!