Monday, July 23, 2012

Misc links

Great applications for police UAVs

How To Survive A Robot Uprising

First German Russian Conference on Robotics

A nice article on soldering tools.

I can't use my AR Drone when inside VirtualBox, even after reading this howto.

Jamming elephant trunk -- awesome.

Autonomous flying indoors (ouch), with an on-board lidar and outdoors using PTAM (earlier version here).

Speed limit for birds (and drones).

Interesting thoughts on robotics.

Info on inductive learning.

Literate programming review; PyLit looks interesting for code to rst capabilities.

Deleting Mercurial revisions

A Mercurial repository I use for my research group got several large movies added to it, so that several machines could no longer pull. I removed them; my notes on the process:



  1. hg convert --config convert.hg.saverev=False robotics_research_big robotics_research --filemap no_ogv, where no_ogv follows this format, per ideas here.
  2. In Bitbucket, strip to revision 0 to empty out the repository (the strip trick).
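For reference, a filemap for `hg convert` is a plain text file of `include`, `exclude`, and `rename` directives, one per line. Below is a minimal sketch of how a file like no_ogv could be generated instead of typed by hand. The function name `build_filemap`, the sample paths, and the choice to quote every path are my own assumptions, not part of the original notes.

```python
from pathlib import PurePosixPath


def build_filemap(tracked_files, suffix=".ogv"):
    """Return filemap text that excludes every tracked file ending in suffix."""
    lines = []
    for path in sorted(tracked_files):
        if PurePosixPath(path).suffix.lower() == suffix:
            # Quote the path so that names containing spaces stay one token
            # (assumption: the filemap parser honors shell-style quotes).
            lines.append(f'exclude "{path}"')
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    # Hypothetical file list; in practice it could come from `hg manifest --all`.
    files = ["papers/draft.tex", "movies/run 1.ogv", "movies/demo.ogv"]
    print(build_filemap(files), end="")
```

The output would be saved as no_ogv and passed to `--filemap`. Matching on the suffix is case-insensitive, so `.OGV` files are caught as well.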

Monday, December 5, 2011

Collaborative bibliographies

How can multiple authors share a set of bibliographic entries, referencing them as necessary during the creation of a single, shared document?
  • BibTeX / LaTeX -- either the only choice for the faithful or an unacceptable morass for the unwashed. I consider it morally wrong for a word processor to generate syntax errors, so I fall in the second group.
  • EndNote -- the mainstream, commercial solution. Sharing requires creativity, as documented below.
  • Zotero, WizFolio, etc. -- I have no experience and feel that these products are a bit too young to inflict on a large team, members of which may not love shiny new applications as much as I.
So, that leaves EndNote. I have a love/hate relationship with it -- in spite of regular upgrades, many of the annoyances in the first version I used persist into X5, some of which I discuss today. For EndNote sharing:
  • A number of problems make their EndNote Web application a poor choice.
  • As in the first version I used, the main program permits only single-user access. Sorry, no Dropbox, no network share, no Windows Offline, no, no, no. What year are we in?
So...the appropriate approach is to:
  1. Create a new Google Doc (or some other document which allows multiple users to edit it simultaneously).
  2. Each author will then enter references to be cited into their own EndNote library.
  3. Each user can then cite this entered reference by first selecting the reference to cite in EndNote then pressing Ctrl+c (copy). In Google Docs, press paste. This will generate a temporary citation of the form {First author's last name, year #num}. While this works in most cases, EndNote doesn't use the record number (the #num entry) to disambiguate entries with the same last name and year of publication, leaving you somewhat confused as to what {Smith, 1988 #23} really meant when you first cited it. Therefore, manually add one other field of your choice (such as the title) to the temporary citation before the record number, giving {First author's last name, year title #num}. There are many other nifty temporary citation options.
With that in place, one person (the coordinator) will then produce a final Word document with citations and a bibliography. To do so:
  1. All collaborators should save their EndNote library as a compressed library in a shared location (e.g. Dropbox).
  2. The coordinator should then copy everyone else's compressed library to a separate, non-shared directory, then open all these libraries in EndNote.
  3. Now, the coordinator simply opens the shared document in Word and clicks on EndNote | Update references.

Friday, November 4, 2011

Bibliographies and collaboration

I recently finished writing a proposal with a collaborator. We both used EndNote X5 and EndNote Web to share our bibliographies. Problems I found:
  1. Upload from desktop to web is slow. Since updating in-text citations depends on the web, this makes checking newly entered citations or fixing broken citations painful.
  2. The web version discards trailing spaces, while the desktop version does not. As a result, hacking an entry to make it pretty works in preview but not in the final document.
  3. Deleting a reference on the desktop doesn't delete the corresponding reference on the web. The reverse is probably also true, but untested.
  4. A Word doc builds a traveling library, which a simple "update citations" does not update. Instead, the best approach is to unformat all citations, quit Word, restart, then reformat. However, this can lead to a number of ambiguous citations. In particular, EndNote will perform a case-insensitive match against any string in any entry; it also gets lost if there are duplicate entries.
My final conclusion is that the EndNote Web process is a poor one.

We also chose to collaborate using Google Docs, which worked very well. EndNote usage was another matter. In particular, the temporary citation process: pressing Ctrl+c on an entry in EndNote desktop and Ctrl+v in another application (Google Docs) produces a citation in the form {Author's last name, year #num}. However, EndNote doesn't trust its #num as a unique identifier and will find anything with a matching last name / year entry. So, using this will cause problems, since the text (last name, year, obscure number) doesn't help humans identify the paper either. In fact, EndNote will match any {Author's last name, field} formatted item.

Therefore, a much better approach is to pick a more distinctive field, such as the title: {Author's last name, title}. If this isn't unique, then use {last name, label}, where the label is a unique string. Of course, checking uniqueness requires an update of citations, which is slow for EndNote Web. The full process is then: look at an entry to cite, manually create an {Author's last name, title} entry, download the paper, update citations, make sure all citations auto-resolve, then fix any problems.
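Since the update step is slow, a quick scan of the shared text can flag the risky citations first. The sketch below is only an illustration: it assumes temporary citations look like {Last name, year #num}, optionally with extra text such as a title before the #num, and it reports any last name / year pair that appears with more than one record number. The regular expression and the function name `ambiguous_citations` are mine, not anything EndNote provides.

```python
import re
from collections import defaultdict

# Matches {Last name, year ... #num}; the middle part (e.g. a title) is optional.
TEMP_CITATION = re.compile(r"\{([^,{}]+),\s*(\d{4})([^{}#]*)#(\d+)\}")


def ambiguous_citations(text):
    """Return the (last name, year) pairs cited with more than one record
    number, mapped to the sorted record numbers involved."""
    by_key = defaultdict(set)
    for match in TEMP_CITATION.finditer(text):
        last_name, year, _extra, number = match.groups()
        by_key[(last_name.strip(), year)].add(int(number))
    return {key: sorted(nums) for key, nums in by_key.items() if len(nums) > 1}


if __name__ == "__main__":
    # Hypothetical draft text with two different Smith 1988 papers.
    draft = "See {Smith, 1988 #23} and {Smith, 1988 #41}; also {Jones, 2001 #7}."
    print(ambiguous_citations(draft))
```

Anything this reports is a candidate for the {Author's last name, title} form. It cannot catch collisions between two collaborators' libraries that happen to reuse the same record number, so the full update is still needed at the end.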

So, what's a better process? Some ideas:
  1. Continue to use EndNote Web:
    1. Improve speed: upload only citations, not attachments, to EndNote Web. Create a new library for each new paper and only upload from that smaller library.
    2. Periodically delete all refs in EndNote Web then reupload to remove any deletions made on the desktop.
    3. Deal with any eaten trailing spaces by sighing in frustration.
  2. Move to EndNote on the PC.
    1. You can't share a file with a collaborator over the network / Dropbox / whatever.
    2. The alternative: get a copy or compressed copy of your collaborator's library and use that.
  3. Try another program. Both Zotero and WizFolio seem to be better at these things.
    1. Zotero supports temporary citations much like EndNote.
    2. WizFolio supports an older version of Google Docs, but not a temporary citation workaround I can see.

Thursday, July 14, 2011

Pygments to the rescue

I've finally realized that Pygments is the Answer. I can simply run my code through it and get the output I'm looking for with only a little manual tweaking (see below); I'm sure this can be integrated into Pygments with a bit of hacking. Then, a simple copy and paste retrieves the original source code! Some additional tweaks:
  1. I'd like to hide the comment character (# in Python, // in C) to make it look nicer, by giving it a very small font. I suspect I could edit the output of the lexer, or perhaps play with the formatter, to do this.
  2. I need to disable HTML escaping for comments (this is done in the formatter). Perhaps a quick check for non-HTML < characters and escaping only these would help.
  3. I rely on the code editor to support line wrapping, which isn't always present. I don't see an obvious work-around for this. 
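The second tweak, escaping only the < characters that are not HTML, could look something like the sketch below. It uses only the standard library and does not touch Pygments itself; the function name and the "looks like a tag" rule are my assumptions. The rule is a heuristic: text such as `a <b and c> d` would be mistaken for a tag.

```python
import re

# A "<" counts as markup only when it starts something shaped like a tag:
# an optional "/", a letter, then anything up to the closing ">".
TAG_LIKE = re.compile(r"</?[A-Za-z][^<>]*>")


def escape_stray_less_than(comment):
    """Replace each "<" that is not part of a tag-like span with "&lt;"."""
    pieces = []
    position = 0
    for match in TAG_LIKE.finditer(comment):
        # Escape the plain text before the tag, then keep the tag verbatim.
        pieces.append(comment[position:match.start()].replace("<", "&lt;"))
        pieces.append(match.group(0))
        position = match.end()
    pieces.append(comment[position:].replace("<", "&lt;"))
    return "".join(pieces)


if __name__ == "__main__":
    print(escape_stray_less_than("if a < b, see <b>this</b>"))
```

A stricter version could check tag names against a whitelist of HTML elements, which would cut down on false positives.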


My simple tweak to the HTML style from Pygments to make comments look nice:
body .c { color: #408080; font-family: Sans-serif; white-space: normal; font-size: 90% } /* Comment */

Documentation and more

Documentation

I continue to work on my documentation idea. So far, I've run into two problems. First, I'll need my code to HTML converter to recognize strings so that it doesn't mistake comment characters in a string for a true comment: printf("// not a comment"). Second, the problem of initial spaces: how can I properly translate them? If a source file indents a comment by 4 spaces and then the following line of code by 4 spaces, it looks fine. However, in HTML the code font and comment font will be different, so those 4 spaces cause things to look ugly either in HTML or (if the spacing works in HTML) in the code.

I can think of a couple ideas:
  1. Auto-space -- in code, indent a comment line to match the spacing of the next code line. This would work most of the time. In HTML, do the same; make the space characters the same as the code font to ensure alignment.
  2. In HTML, always insert the same number of spaces as the code, in the code font. How would I detect these spaces and automatically remove them when going the other direction? Perhaps tagging the initial spaces as .
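The auto-space idea (option 1) on the source-code side might be sketched as follows. This is a rough illustration under simple assumptions: comments occupy whole lines, the comment character is `#`, and no multi-line string contains a line that starts with it. The function name `align_comments` is hypothetical.

```python
def align_comments(lines, comment_prefix="#"):
    """Re-indent each full-line comment to match the next code line."""
    result = list(lines)
    next_indent = None
    # Walk backwards so each comment sees the indentation of the code below it.
    for index in range(len(result) - 1, -1, -1):
        stripped = result[index].lstrip()
        if not stripped:
            continue  # blank lines do not change the pending indentation
        if stripped.startswith(comment_prefix):
            if next_indent is not None:
                result[index] = next_indent + stripped
        else:
            # Remember the leading whitespace of this code line.
            next_indent = result[index][: len(result[index]) - len(stripped)]
    return result


if __name__ == "__main__":
    source = ["# setup", "    x = 1", "  # explain y", "", "        y = 2"]
    print("\n".join(align_comments(source)))
```

A comment with no code after it is left where it is, since there is nothing to align it to.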
For correctly recognizing strings, I'll need some sort of lexer. Yuck. One option is to start with the Python tokenizer; its source code (see link on that page) contains all the necessary regular expressions. Another is to use Pygments, which I'll want for syntax highlighting anyway. I may hack around this for now just to get some working code, then return to fix it. A related but simpler problem is dealing with C /* */ comments.
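For Python source at least, the standard library's tokenize module already separates real comments from comment characters inside strings. A minimal sketch (the helper name `comment_tokens` is mine, and this covers Python input only, not C or VBA):

```python
import io
import tokenize


def comment_tokens(source):
    """Return (line, column, text) for every real comment in Python source."""
    reader = io.StringIO(source).readline
    return [
        (tok.start[0], tok.start[1], tok.string)
        for tok in tokenize.generate_tokens(reader)
        if tok.type == tokenize.COMMENT
    ]


if __name__ == "__main__":
    code = 'print("# not a comment")  # a real comment\n'
    for line, column, text in comment_tokens(code):
        print(line, column, text)
```

The `#` inside the string literal is reported as part of a STRING token, so only the trailing comment comes back.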

In working with the idea, I'm continually surprised by how much just writing about the problem has helped me to solve it. I believe that this will be a big help for me in all future projects, if I can actually find the time to implement it.

Other things
Do robots take people's jobs? No, they empower people and create jobs, as the arguments in this article show. It includes some nice historical perspective and facts (did word processors eliminate secretaries?).

I found a list of the top 25 most dangerous software errors (from a security perspective). Interesting.

Friday, July 8, 2011

After playing with my documentation idea a bit, I discovered a serious problem: what I created looked great in Word, but all the comments existed only in Word, leaving me a bit lost when I looked at the source code itself. This is a problem; for many, the source code will be the first thing they see and the only thing they see. In many applications (fixing compiler errors, debugging) I'll be working with the source code. The moral of the story: the source code matters!

In particular, Knuth composed in a Web file, then produced both a .tex and a .pas (Pascal source file, whatever the extension was). However, neither produced file was editable or even very human-friendly. Instead, I now see that the "pretty" format (in Word or whatever) and the "plain" format (raw source code) should both be nicely formatted and easily readable.

That is, I'm building a bridge between a beautiful representation of the code (probably HTML) and a functional representation of the code (as plain text). The beautiful form makes it easier to edit documentation and comments, insert diagrams, videos, etc., while the functional form provides a tight coupling with the compiler / debugger.

That changes everything in terms of my design.

Before I get too carried away with it, let me test-drive this idea by providing some example code. This is the beautiful form, taken from a unit-testing section of the document.


Testing
I don’t have a unit testing framework. So, I’ll develop what’s necessary as I go. There’s a framework for Excel, but it’s very tied to that application. Problems so far with the home-brew approach:
1. There’s no automatic test discovery; I have to manually add all tests.
2. There’s no setup()/teardown() facility
3. There’s no “clean the environment” comment. For example, strLastError can be polluted by earlier tests.

Source file split testing
A documentation file with no extension should produce an error.
Sub Test_SourceWithNoExtension()

First, create a dummy document to test with.
    Dim docSource As Document
    Set docSource = Documents.Add
    Dim strFileName As String
    strFileName = "Word documentation idea test."
    docSource.SaveAs fileName:=strFileName

Now, do our testing.
    OpenDocFile

Clean up by closing and erasing this old doc. If the test breaks, the developer must close it manually. Time to look for a try/catch in VBA (On Error statement)
    docSource.Close
    Kill strFileName

Check that it worked. Errors are reported as strings, so check for the correct error text.
    Assert (strLastError Like "*Documentation file has no extension:*")
End Sub


Now, here's how I'd like to see this in the functional form (as source code). Since this is VBA, the comment character is the single quote '.
' <h1>Testing</h1>
'
' <p>I don’t have a unit testing framework. So, I’ll
' develop what’s necessary as I go. There’s a framework
' for Excel, but it’s very tied to that application.
' Problems so far with the home-brew approach:</p>
' <ul>
'   <li>There’s no automatic test discovery; I have to
'     manually add all tests.</li>
'   <li>There’s no setup()/teardown() facility</li>
'   <li>There’s no “clean the environment” comment.
'     For example, strLastError can be polluted by
'     earlier tests.</li>
' </ul>
'
' <h2>Source file split testing</h2>
' <p>A documentation file with no extension should
' produce an error.</p>
Sub Test_SourceWithNoExtension()

    ' <p>First, create a dummy document to test with.</p>
    Dim docSource As Document
    Set docSource = Documents.Add
    Dim strFileName As String
    strFileName = "Word documentation idea test."
    docSource.SaveAs fileName:=strFileName

    ' <p>Now, do our testing.</p>
    OpenDocFile

    ' <p>Clean up by closing and erasing this old doc.
    ' If the test breaks, the developer must close it
    ' manually. Time to look for a try/catch in VBA
    ' (On Error statement)</p>
    docSource.Close
    Kill strFileName

    ' <p>Check that it worked. Errors are reported as strings, so check for the correct error text.</p>
    Assert (strLastError Like "*Documentation file has no extension:*")
End Sub

It's interesting that, to me, reading the first is much easier than reading the second. Not because of the HTML markup, but because a simple difference in font provides visual cues to divide the code nicely. It feels good to me to read the first! This is certainly what I'm striving for.
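Going from the functional form to the beautiful form can be sketched in a few lines. This is only a rough illustration of the direction I have in mind, with several assumptions: a comment is any line whose first non-blank character is the comment character (so a ' inside a string is never misread), comment text is already valid HTML, and each run of code becomes one pre block. The function name `functional_to_html` is hypothetical, and the reverse direction is not covered.

```python
from html import escape


def functional_to_html(source, comment_char="'"):
    """Turn commented source into HTML: comments pass through as markup,
    runs of code become <pre> blocks."""
    html_parts = []
    code_run = []

    def flush_code():
        # Drop blank lines at the edges of a run, then emit it as one block.
        while code_run and not code_run[0].strip():
            code_run.pop(0)
        while code_run and not code_run[-1].strip():
            code_run.pop()
        if code_run:
            body = escape("\n".join(code_run), quote=False)
            html_parts.append("<pre>" + body + "</pre>")
        code_run.clear()

    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith(comment_char):
            flush_code()
            # Remove the comment character and one following space, if any.
            text = stripped[len(comment_char):]
            html_parts.append(text[1:] if text.startswith(" ") else text)
        else:
            code_run.append(line)
    flush_code()
    return "\n".join(html_parts)


if __name__ == "__main__":
    demo = "' <h1>Testing</h1>\nSub Demo()\n    ' <p>Do it.</p>\n    OpenDocFile\nEnd Sub\n"
    print(functional_to_html(demo))
```

Code indentation is kept inside the pre blocks, while the indentation of comment lines is discarded, which sidesteps the font-alignment problem for comments at the cost of losing their original spacing.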

I haven't found a reasonably-featured word processor that reads and writes HTML, though. Word includes lots of goop, but has all the features I want. I need to try OpenOffice and also Composer to see if they're reasonable. While I like several of the browser-based editors (Google Sites / Docs is great), the "allow no access to local files" paradigm seems to prevent their use in editing local files.