I put my first Erlang program in a pastebin. It's a concurrent prime sieve. Likely not the most efficient way to do things, but I'm still all pleased with myself. :-)
I may or may not choose to program more sophisticated things in Erlang, but I figured a passing familiarity was in order. Especially since I'm thinking of using CouchDB for something and it's written in Erlang. While knowing Erlang isn't necessary to understand CouchDB, I figure that it certainly can't hurt.
In many books about the singularity, the idea comes up of having your thought processes run on some interesting and imaginative substrate. Say, as an emergent property of a flock of pigeons. While this might well be possible, I think NP completeness places some hard limits on exactly what an external observer can determine about such systems.
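There is an interesting problem that might be NP complete called the graph isomorphism problem. It is known to be in NP, but nobody has shown it to be either NP complete or solvable in polynomial time. The problem asks whether two graphs can be matched up by a one-to-one mapping between their vertices that preserves every edge, which would show that one graph is just a relabeling of the other.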
So, if you have two different entities claiming to be the same entity running on a different substrate it's very hard to tell if they really are unless they tell you the mapping.
A plot element in some post-singularity novels is the idea of someone hiding themselves in various places by having themselves run on a wide variety of unusual substrates. A sort of steganography of consciousness. If the graph isomorphism problem is NP complete, then finding entities of human-level complexity who are doing this is likely practically impossible. Even the resources of a matrioshka brain are likely not enough to do the computation required to find them.
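The asymmetry this argument leans on is that checking a mapping someone hands you is cheap, while finding one yourself is the hard part. A minimal C++ sketch of the cheap half follows. The names (Edge, EdgeSet, make_edge, is_isomorphism) are made up for illustration, and it assumes undirected graphs whose vertices are numbered 0..n-1 and whose edges were all built with make_edge.

```cpp
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical representation: an undirected edge stored as (smaller, larger).
typedef std::pair<int, int> Edge;
typedef std::set<Edge> EdgeSet;

inline Edge make_edge(int a, int b)
{
    return a < b ? Edge(a, b) : Edge(b, a);
}

// Checks a *claimed* mapping (vertex i of g goes to mapping[i] in h).
// This only verifies a mapping; it makes no attempt to find one.
inline bool is_isomorphism(const EdgeSet& g, const EdgeSet& h,
                           const std::vector<int>& mapping)
{
    const int n = static_cast<int>(mapping.size());

    // The mapping has to be a bijection on 0..n-1.
    std::vector<bool> used(mapping.size(), false);
    for (int i = 0; i < n; ++i) {
        const int target = mapping[i];
        if (target < 0 || target >= n || used[target]) return false;
        used[target] = true;
    }

    // With a bijection and equal edge counts, it is enough to check that
    // every edge of g lands on an edge of h.
    if (g.size() != h.size()) return false;
    for (EdgeSet::const_iterator it = g.begin(); it != g.end(); ++it) {
        if (it->first < 0 || it->first >= n) return false;
        if (it->second < 0 || it->second >= n) return false;
        if (h.count(make_edge(mapping[it->first], mapping[it->second])) == 0)
            return false;
    }
    return true;
}
```

The check is a single pass over the vertices and edges. Without the mapping, the obvious approach is to try permutations of the vertices, and that is what blows up.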
This question, generalized to the software field as a whole, has been of great interest to me for a long time. And the main conclusions I've come to are that the whole topic is very complex and nuanced and there aren't a lot of simple answers.
Most interesting to me are the knee-jerk reactions, many of which are in evidence in this article on Slashdot titled "FOSS Sexism Claims Met With Ire and Denial".
I will address a few of them here...
( Cut for brevity )
And the only way I'll let myself be associated with a patent is if it's clear the patent will never be asserted against any software meeting the open source definition or meeting the free software definition. This is a concession he's unwilling to make. In particular, he wants some kind of definition for 'commercial' against which the patent can be asserted.
Oh, well. I'm going to generously allow him to use it in proprietary software if he so chooses. I'm trying to think up a set of conditions that will make sure the source never appears in public so nobody is ever tempted to put themselves in the way of a patent, should he choose to file one.
He thinks I'm completely nuts, and also thinks my principles are antithetical to his ability to make a living. *sigh* That's not how things work. The only power in ideas is if they're shared widely and freely.
I think the idea is a neat idea, but not that neat. Like all ideas it builds on and incorporates existing ones. For all I know, someone has already thought of doing something like it. I know at least one project of mine was something close.
Because of an agreement with my collaborator I have to keep a bit quiet about exactly what it is for now. But I've been working hard on it for the past 4-6 weeks or so. And it's mostly in that "It'll crash at the drop of a hat (by design) but the major functionality works." state, so it needs a lot of polishing before I'll really consider it worthwhile.
It's an implementation of a really interesting idea related to how pure functional programs handle I/O. More than that, I'm not going to share just yet. :-)
I will say that there is a lot of potentially re-usable C++ code in there for wrapping up various Unix and networking concepts in a nice pleasant wrapper. I've always thought that the way Python wraps all that stuff up in a framework that throws exceptions is very nice for rapid prototyping, and also nice for cleaner error handling.
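To illustrate the style I mean (this is a made-up sketch, not code from the project), here is roughly what wrapping a Unix file descriptor so that failures throw looks like. The class name FileDescriptor is hypothetical, and it assumes a POSIX system and a C++11 compiler for std::system_error.

```cpp
#include <cerrno>
#include <string>
#include <system_error>

#include <fcntl.h>
#include <unistd.h>

// Hypothetical wrapper, for illustration only.
class FileDescriptor {
public:
    // Opens path read-only. On failure, throws std::system_error carrying
    // the errno value instead of returning -1.
    explicit FileDescriptor(const std::string& path)
        : fd_(::open(path.c_str(), O_RDONLY))
    {
        if (fd_ < 0) {
            const int err = errno;  // save before anything else can change it
            throw std::system_error(err, std::generic_category(),
                                    "open " + path);
        }
    }

    // The descriptor is closed automatically when the wrapper goes away.
    ~FileDescriptor()
    {
        if (fd_ >= 0) ::close(fd_);
    }

    // Copying would lead to a double close, so it is disabled.
    FileDescriptor(const FileDescriptor&) = delete;
    FileDescriptor& operator=(const FileDescriptor&) = delete;

    int get() const { return fd_; }

private:
    int fd_;
};
```

The caller never checks a return code. Either the object exists and holds a valid descriptor, or an exception with the errno value is on its way up the stack.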
This really excellent post makes the analogy of upgrading to IPv6 being like moving to longer phone numbers. Not that the analogy is perfect by any means, but it is useful for illustrating some important differences in the attitudes of people towards it.
One cogent commenter mentions:
The telephone equivalent of NAT is a PBX with built-in extensions, but you're right in that no one is suggesting that PBXes will relieve the burden of upgrading the phone system at some point.
autoconf is annoying to work with, and I think that programs that rely excessively on it for cross platform compatibility have issues of their own. Sometimes you really have no choice though.
Fortunately I recently discovered one place where I now do have a choice where I didn't before. This little code snippet can be optimized by gcc at compile time into a constant expression. That means that gcc realizes there is only one possible result and it uses that result in place of actually running the code in the function. Here is the code snippet:
inline bool is_little_endian()
{
    const union {
        ::std::tr1::uint32_t tval;
        unsigned char tchar[4];
    } testunion = { 0x11223344ul };
    return testunion.tchar[0] == 0x44u;
}
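C99 allows reading a union through a member other than the one last written, and gcc documents that it supports the same kind of type punning, so this works as intended there. As I said, gcc is also capable of recognizing it as a constant expression. This also means that if you have code like this: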
if (is_little_endian()) {
    do_something();
} else {
    do_something_else();
}
gcc will be smart enough to see that only one branch of that if will ever be taken and optimize the other completely out of your code.
Normally you'd want to use autoconf for this so you will have a preprocessor macro that will elide the code for you. The fact gcc can optimize this well means you don't have to do that to get efficient code.
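Strictly speaking, ISO C++ does not bless reading the inactive member of a union the way C99 does, so here is a sketch of the same test using memcpy instead. The function name is_little_endian_memcpy is made up, and it assumes a compiler that provides <cstdint>. I would expect an optimizing compiler to fold this down to a constant as well, but I have not checked that for this version.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical variant of is_little_endian() that copies the bytes out
// instead of reading them through a union member.
inline bool is_little_endian_memcpy()
{
    const std::uint32_t value = 0x11223344u;
    unsigned char bytes[sizeof value];
    std::memcpy(bytes, &value, sizeof value);
    // On a little-endian machine the low-order byte is stored first.
    return bytes[0] == 0x44u;
}
```

It is used exactly like the original: if (is_little_endian_memcpy()) { ... } else { ... }.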
I've noticed an interesting oddity in IPv6 addressing...
Addresses of the form ::ffff:n.n.n.n (IPv4-mapped IPv6 addresses) refer to IPv4-only hosts, so that a program written for IPv6 only and running on a dual-stack machine can still address them. There is another class of address that is similar, but not quite the same: "IPv4 compatible IPv6 addresses", which are of the form ::n.n.n.n. I don't actually understand when that class would ever be used.
Interestingly the IPv6 IN6ADDR_ANY address is ::, which is equivalent to ::0.0.0.0. Fortunately, the IPv4 INADDR_ANY address is 0.0.0.0 (also known as 'this host on this network' in the 'Special Addresses' section of RFC 1700) so there doesn't seem to be any real problem.
And finally, the real problem. The IPv6 equivalent of localhost or IPv4's 127.0.0.1 is ::1, and this is equivalent to ::0.0.0.1 which makes it an 'IPv4 compatible IPv6 address'. But the IPv4 address it maps to is, according to RFC 1700, some kind of local identifier for a host on a network. That seems like an odd conflict and inconsistency to me, and I'm not really sure what it means.
Of course, I've never seen any addresses in the 0.0.0.0/8 block be used at all aside from 0.0.0.0 itself, so it's likely not a real problem. But I'm still curious.
Edit 16:36: I have the answer. According to RFC 4291 section 2.5.5.1, the meaning of ::n.n.n.n addresses as 'IPv4 compatible IPv6 addresses' is deprecated, so there is no longer any overlap in meaning between the special IPv6 addresses ::1 and :: and any IPv4 address.
Well, that was the right decision. The distinction between ::ffff:n.n.n.n addresses and ::n.n.n.n addresses was confusing and unclear anyway.
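To make the overlap concrete, here is a small sketch that classifies a raw 16-byte IPv6 address purely by its layout. The names (Ipv6Bytes, is_v4_mapped, has_v4_compatible_layout) are invented for this example, and it deliberately ignores the deprecation, which is exactly why :: and ::1 match the second test.

```cpp
#include <array>
#include <cstddef>

// Hypothetical representation: the 16 address bytes in network order.
typedef std::array<unsigned char, 16> Ipv6Bytes;

// True when bytes [first, last) are all zero.
inline bool all_zero(const Ipv6Bytes& a, std::size_t first, std::size_t last)
{
    for (std::size_t i = first; i < last; ++i) {
        if (a[i] != 0) return false;
    }
    return true;
}

// ::ffff:n.n.n.n is 80 zero bits, 16 one bits, then the IPv4 address.
inline bool is_v4_mapped(const Ipv6Bytes& a)
{
    return all_zero(a, 0, 10) && a[10] == 0xff && a[11] == 0xff;
}

// ::n.n.n.n is 96 zero bits, then the IPv4 address. This purely structural
// test also matches :: and ::1, which is the conflict described above.
inline bool has_v4_compatible_layout(const Ipv6Bytes& a)
{
    return all_zero(a, 0, 12);
}
```

Feeding it ::1 (fifteen zero bytes followed by a 1) gives true for has_v4_compatible_layout and false for is_v4_mapped.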
I'm quite pleased with myself. :-) I've been participating on stackoverflow.com recently. HazyBlueDot had an interesting question in which (s)he was trying to use ::boost::function to get around a broken library interface.
In particular, the library interface allowed you to register a callback function, but it did not provide you a way of giving it a void * or something to pass back to you so you could put its call of your function back in context. HazyBlueDot was trying to use boost::function in combination with boost::bind to add in the pointer and then call their own function. The only problem is that the resulting boost::function object then couldn't produce an ordinary function pointer to pass to the callback.
This, of course, cannot be done in a static language like C++. It requires on the fly generation of first-class functions. C++ simply can't do that. But, there are various interesting tricks you can pull to generate functions at compile time with templates in ways that can help with this problem, even if they can't fully solve it.
I'm particularly pleased with my solution, which looked something like this:
( Cut so that people who find that source code makes their eyes bleed don't have to look )
This basically allows you to automatically generate a 'thunk' function, a normal non-member function that can be passed to the callback, that then calls another function and adds the contents of a global variable you specify as a template parameter. It doesn't fully solve the problem, but it partially solves it. And I think in this case it will do something pretty close to what HazyBlueDot wants.
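For those who skipped the cut, here is a stripped-down sketch of the general technique. It is not the code from my answer; the names (callback_t, Counter, add_to_counter, thunk, g_counter) are all invented for this example, and the callback signature is an assumption.

```cpp
// Hypothetical library callback type: a plain function pointer with no
// context argument, as in the broken interface described above.
typedef void (*callback_t)(int);

// Hypothetical context object and the function we really want called.
struct Counter {
    int total;
};

inline void add_to_counter(Counter* ctx, int value)
{
    ctx->total += value;
}

// The template stamps out one ordinary function per combination of
// (global variable, target function). Each instantiation reads the global
// at call time and passes it along as the missing context argument.
template <typename Context, Context** Global, void (*Target)(Context*, int)>
void thunk(int value)
{
    Target(*Global, value);
}

// The global variable named as a template parameter.
Counter* g_counter = nullptr;
```

After pointing g_counter at a Counter, &thunk<Counter, &g_counter, &add_to_counter> is an ordinary function pointer of type callback_t that can be handed to the library. The limitation is that you get one context per global variable, all fixed at compile time.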
Empathy has been starting to make it into Linux distributions as the default IM client. I think this is a mistake at this juncture, and this bug about Empathy not supporting OTR is one of the larger reasons why.
Another reason why is that Empathy seems to be connected with several different libraries and there is no clear sense as to what functionality lives where. It appears to be something of a spaghetti mess of libraries. I mostly figured this out because of repeated calls to 'code it or shut up' in response to the bug I posted.
One of my responses was good enough that someone else felt the need to cross-post a link to it in the Launchpad bug about lack of OTR support in Empathy.
I will cross-post it here:
( My comment on OTR in Empathy at bugs.freedesktop.org )
Someone else goes on later to suggest that Empathy support some horrible idea like TLS over XMPP. Which, in addition to being an awful idea for any number of reasons, also fails to address the issue of support for any protocol aside from XMPP.
In order for encryption to be useful in a communications system, everybody has to be able to use it whether they want it or not. It should be a first-class feature designed in from the very beginning, not tacked on as an afterthought (something that OTR in pidgin fails at) and certainly not treated as unimportant because only a few really want it.
This:
unsigned int clipdigit(unsigned int * const v)
{
    unsigned int digit = (*v) % 10;
    (*v) /= 10;
    return digit;
}
is turned into this:
.globl clipdigit
.type clipdigit, @function
clipdigit:
.LFB11:
.cfi_startproc
movl (%rdi), %ecx
movl $-858993459, %edx
movl %ecx, %eax
mull %edx
shrl $3, %edx
leal 0(,%rdx,8), %eax
movl %edx, (%rdi)
leal (%rax,%rdx,2), %edx
movl %ecx, %eax
subl %edx, %eax
ret
.cfi_endproc
.LFE11:
.size clipdigit, .-clipdigit
As a small hint/bit of explanation, 2^32 - 858993459 = 3435973837 = (2^35 + 2) / 10.
Is mull really that much faster than divl on x86_64 machines?
I was expecting to get code more like this rather straightforward bit:
.globl clipdigit
.type clipdigit, @function
clipdigit:
.LFB11:
.cfi_startproc
movl (%rdi), %eax
movl $10, %ecx
xorl %edx, %edx
divl %ecx
movl %eax, (%rdi)
movl %edx, %eax
ret
.cfi_endproc
.LFE11:
.size clipdigit, .-clipdigit
It turns out in testing that the second clip of code is much, much slower than the first clip. The strange mull method is about 5 times faster than the straightforward divl method. Wow, divl seems really broken if it's that slow.
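For anyone who doesn't read assembly, the multiply trick in the first listing can be spelled out in C++ roughly like this. The function name clipdigit_mul is made up, and this is my reading of what the generated code does, not something gcc emits: mull leaves the high 32 bits of the 64-bit product in %edx, and the shrl $3 brings the total right shift to 35.

```cpp
#include <cstdint>

// Hypothetical hand-written equivalent of the mull-based clipdigit.
inline unsigned int clipdigit_mul(std::uint32_t* const v)
{
    // 0xCCCCCCCD == 3435973837 == (2^35 + 2) / 10
    const std::uint64_t magic = 0xCCCCCCCDull;
    const std::uint32_t n = *v;

    // Multiply in 64 bits, then shift right by 35: this equals n / 10
    // for every 32-bit unsigned n.
    const std::uint32_t quotient =
        static_cast<std::uint32_t>((n * magic) >> 35);

    *v = quotient;
    // The remainder is what is left after subtracting quotient * 10,
    // which is what the two leal instructions and the subl compute.
    return n - quotient * 10u;
}
```

So the division by a constant becomes one multiply, one shift, and a little arithmetic to recover the remainder.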
While I do not recommend that anybody develop anything for the iPhone, I was recently investigating something about it.
I would like to repeat that I do not recommend that anybody develop anything because Apple's policies make it questionable as to whether or not your app will ever make it to the app store at all. They have no compunctions about refusing apps for the most bizarre of reasons, and even worse, refusing apps because either Apple or AT&T perceives them as somehow competing.
If you want to develop for a mobile phone platform, go for Android. That is a clear and open market.
That being said, I did have reason to investigate C++ on the iPhone recently, and I came across these 3 well-written articles by someone who's tried to develop C++ on several different mobile phone environments. I would like to point to them here because other people looking for this information should be able to find it easily, and so I can find it easily.
So yes, it appears the iPhone does C++ just fine. That guy was promising to write more on how Objective-(C/C++) and C++ mixed. But I don't think he ever got around to it.
The publisher of several books by George Orwell decided that they didn't like the fact that they'd published them electronically. Many people had bought these books for their Kindle. Mysteriously, these books completely disappeared from people's Kindle book readers.
In my humble opinion, people who bought a Kindle deserve exactly what they got, and I hope Amazon does it again. If you buy into DRM in any way you are asking for stuff like this to happen to you. The reasonable response is not to complain bitterly about how unfair it is, but to not buy DRM enabled products.
People seem in a terrible rush to trade away essential rights for the sake of convenience. They spare little thought for what they're doing and then act surprised at the ultimate result.
At the recent Convergence I was on a panel about copyright. People there persisted in calling copyright a 'property right' and referred to the vast network of weird and wonderful rights that are patents, trademarks and copyrights as 'intellectual property'. I object strongly to the conflation of trademark, patent and copyright into 'intellectual property'. The rules around each are very non-property-like and very different from each other.
And Techdirt comes to the rescue again with an article about how in many ways copyright is very much not a property right.
I haven't played it yet, but I would like to note that Frictional Games has released Penumbra for Linux.
Blog of Helios wrote a nice blog entry detailing why this is such a great game. Unlike World of Goo, I'm not sure how well it will work on older hardware.
It's really nice to start seeing publishers of really good games start supporting Linux. The smaller publishers in the games industry tend to make the most interesting games, and it's the smaller publishers that have been doing Linux ports. Even though the larger publishers make less interesting games, their games tend to be more popular. I hope that the success smaller publishers have with porting leads larger publishers to start doing the same thing.
Games are an exception to my rule about using all Open Source if I can help it.
This game is a bit tricky to install on Linux, mostly because of library dependencies, especially on a 64-bit system. Frictional could use some install advice from the nearly trivial to install World of Goo game.
It's World of Goo. It's like a mad cross between Fantastic Contraption and Lemmings.
The best part (in my world anyway :-) is that the game is available for Linux. I wish more game companies would start doing this. It isn't that hard, and there is a market for it. Nearly as much of a market as for Mac games.
The game has gotten some fantastic reviews. It is light-hearted and bizarre, and the physics simulation based puzzles are highly entertaining. Other comparisons that come to mind are the work of Tim Burton and the video game Worms, more for artistic style than anything about how gameplay actually works.
I've been working on coming up with a nice C++ (or, actually, C++0x) interface to the Skein hash function.
Skein has an interesting tree mode in which it's possible to parallelize the hash function calculation to a significant degree. I wanted to write a general interface for this so I could make a command line utility that used it to test it against the sha256sum command.
Applying a tree hash to an existing file is a no-brainer. But I wanted to be able to handle much more general cases in which the leaf data may not be available on a random-access basis, in particular if the file is coming in on stdin or something similar.
I was having difficulty coming up with a general extensible interface for this. Partly the interface for the system for handling leaf data needed a way to allocate chunks of leaf data to work on, and then release them. This would allow for a sliding window type approach to fetching leaf data.
My biggest and most recent breakthrough was realizing that the leaf data objects were like ::std::auto_ptr objects. I didn't want to force heap use, so I needed the data about a leaf to be copyable. But I didn't want to have to have any kind of silly reference counts or anything like that. So that meant I needed it to be moveable, not copyable. Just like auto_ptr. But auto_ptr is a kludge in C++. In C++0x there is a very nice concept called rvalue references that lets you implement move semantics very cleanly.
It took me awhile to realize I wanted move semantics. I kept on beating on the interface and coming up with usage scenarios that were just awkward and broken. Once I figured it out, things went a lot easier.
Here is a link to what I finally came up with: skeintreepp.hpp.
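As a rough illustration of the move-only idea (this is not the contents of skeintreepp.hpp, and the names LeafPool and LeafChunk are invented here), a handle that can be moved but not copied looks something like this with rvalue references:

```cpp
#include <utility>

// Hypothetical stand-in for whatever hands out the sliding window of
// leaf buffers. It only counts how many chunks are checked out.
struct LeafPool {
    int outstanding = 0;
    void acquire() { ++outstanding; }
    void release() { --outstanding; }
};

// Movable but not copyable: exactly one handle owns a chunk at a time,
// with no heap allocation and no reference count.
class LeafChunk {
public:
    explicit LeafChunk(LeafPool& pool) : pool_(&pool) { pool_->acquire(); }

    // Moving transfers ownership and leaves the source empty.
    LeafChunk(LeafChunk&& other) noexcept : pool_(other.pool_)
    {
        other.pool_ = nullptr;
    }

    LeafChunk& operator=(LeafChunk&& other) noexcept
    {
        if (this != &other) {
            reset();
            pool_ = other.pool_;
            other.pool_ = nullptr;
        }
        return *this;
    }

    LeafChunk(const LeafChunk&) = delete;
    LeafChunk& operator=(const LeafChunk&) = delete;

    // Releases the chunk back to the pool if this handle still owns it.
    ~LeafChunk() { reset(); }

    bool owns() const { return pool_ != nullptr; }

private:
    void reset()
    {
        if (pool_) {
            pool_->release();
            pool_ = nullptr;
        }
    }

    LeafPool* pool_;
};
```

Passing a LeafChunk to a worker with std::move hands over the chunk, and the release happens exactly once, wherever the last owner goes out of scope.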
Many years ago I was presented with an interesting puzzle. I did not solve it. I had to be told the answer. It is a very simple answer, but it isn't dumb. Figuring it out requires a significant leap of lateral thinking.
I figured I'd put it here for the amusement of others. Comments will be screened. I will eventually reveal the correct answer, but it may be a few months. :-) And now for the puzzle itself:
What is the next number in this sequence?
Well, actually, a whole lot of stuff, including spending some time with klicrai and someone who tends to go by Crymerci online, but doesn't have an active LJ.
But, the topic of this LJ entry is the work I've been doing on creating a nice C++ interface for Skein, Bruce Schneier and friends' entry into the NIST SHA-3 competition.
Anyway, I'm basing my work on the code in their NIST submission. And, as I said before, it's a nice C++ wrapper for Skein that attempts to give access to all of Skein's nifty features that are in excess of the NIST requirements. It's a hash function, it's a PRNG, it's a MAC algorithm! No, it's Skein!
One kind of gets the impression that Bruce and company are miffed by the fact that Rijndael was chosen over Twofish for the AES competition. And their response was to create a new algorithm that was faster, probably more secure, inordinately flexible and impossible to level the "it's too complicated to analyze effectively" criticism against. That criticism was one of the reasons Twofish wasn't selected in the AES competition.
There is an interesting new result showing that the distribution of prime numbers obeys a modified version of Benford's Law. The result also shows that the same holds for another sequence whose distribution is somehow fundamentally related to the distribution of primes: the zeros of the Riemann zeta function.
It is my feeling that results like this do not strongly affect the usefulness of prime number based cryptography algorithms like RSA. But this is just a guess on my part. Does anybody have a more definitive answer?
The Daily WTF is a publication I generally really enjoy. Their recent article Java is Slow! is one I sort of take issue with.
I don't like Java or anything to do with it. I've said that before in this blog and I'll say it again. I can write a Python script to print "Hello world!" that takes less time to run than an equivalent Java program.
There are sweet spots for ease of development, speed and language verbosity. Java completely fails to hit any of them. It has many of the ills of a compiled language with regards to how easy it is to develop in, it is even slower in many regards than most interpreted languages, and it's almost as verbose as COBOL.
And many of you will complain that Python is definitely slower than Java and point at benchmarks. The benchmark I care the most about though is "Hello world!", and that's a benchmark Java fails miserably at.
The reason I care so much about that particular benchmark is that Java's miserable failure at it is seen by people who want to use Java as a reason absolutely EVERYTHING should be in Java. Because the JVM is so expensive to start, you should only start it once, and then Java should become your OS.
I'm sorry, but no. Python has a perfectly acceptable VM, and it starts up in tens of milliseconds or less. It's not compiled, so it's quick and easy to develop in, and maybe it isn't as fast in the long haul, but those other attributes more than make up for it.
If I really care about speed and want to use a compiled language, I will use C++. If I don't and I want a nice, easy to develop in language, I will use Python. Others can use Perl or Ruby if they want to. Java has no reasonable place in the development landscape, and it never will.