Navigate/Search

Type Inference

Type inference will be in your favorite (mainstream) statically typed language within the year, if it is not already there. There are a few gotchas, but type inference is a tool that can make you code easier to read, thus improving maintainability. It also shortens development time by dozens of seconds.

Type inference is syntax sugar. No more, no less. In a nutshell, type declarations are replaced with a keyword such as auto or var. The compiler then determines the type of the declaration by its context. The best way to explain may be through an example from my job.

Without type inference, there is a lot of duplication:

  1. // I actually had to write some c++ code that was a actually
  2. // worse than this the other day. In my case the namespaces
  3. // had to be preserved and the type names were much longer.
  4. EmailService::TypeNotificationTag WARN =
  5.     EmailService::TypeNotificationTag(
  6.         EmailService::TypeNotificationTag::WARN);
  7. EmailService::DomainNotificationTag SELLER =
  8.     EmailService::DomainNotificationTag(
  9.         EmailService::DomainNotificationTag::SELLER);
  10. . . .
  11. EmailService::sendEmail(emailDetails,
  12.     TagSetBuilder(WARN).add(SELLER).getTagSet());

Notice that the long type name requires the declarations to be wrapped and that the type is mentioned three times per declaration.

With type inference the code is just enough cleaner that it is easier to understand.

  1. // With type inference
  2. // Ed: These normally wouldn’t be wrapped,
  3. // but the ’screen width’ is small
  4. auto WARN = EmailService::TypeNotificationTag(
  5.     EmailService::TypeNotificationTag::WARN);
  6. auto SELLER = EmailService::DomainNotificationTag(
  7.     EmailService::DomainNotificationTag::SELLER);
  8. . . .
  9. EmailService::sendEmail(emailDetails,
  10.      TagSetBuilder(WARN).add(SELLER).getTagSet());

Here is a simpler example that may drive the point home a little more.

  1. // without type inferrence
  2. std::map< std::string,
  3.           std::vector<std::string>
  4.         >::const_iterator iter = items.begin();
  5.  
  6. // with type inference
  7. auto iter = items.begin()

Type inference is not dynamic typing.

How is a duck like a bicycle?
They both have handlebars <pause> except for the duck.

Javascript, Ruby, and Python, all use dynamic typing: The type of the object is determined at run time. Dynamic typing is sometimes refered to as duck typing: if it acts like a duck, use it like you would a duck. While it has a lot of advantages, there can be problems with it. Namely, what happens at run time if you attempt to access an expected method that is not there—even when the rest of the object acts like a duck?

Type inference is not dynamic typing. You actually only need type inference in languages that are statically typed like Java and C++. In type inference, the type is determined at compile time by examining the context of the variable. Good implementations would pick the most abstract type possible; that is, the farthest up the inheritance chain.

Gotchas

There are a few things to know before jumping into throwing autos into your code everywhere. Remember that the compiler needs some sort of context to determine the type of the variable, so type-inferred variable must be assigned to some expression. For example:

  1. auto bad; // no context
  2. auto good = 1;
  3. auto goodItem = new Item<std::string>();
  4. auto goodThingy = goodItem.createThingy(good, alsoGood);

The last two lines in the example leads into a maintainability gotcha. The reader has to know what type is returned by createThingy to know what type goodThingy is; taking valuable time from the code maintainer.

Part of knowing the type is knowing the constness of the variable. This can be especially hard with c++ and const correctness. If there are const and non-const versions of creatThingy, is goodThingy const or isn’t it?

C++ and D developers are not alone in their const troubles. In Java 7, all type-inferred variables will be final.

Language support

Most mainstream static-typed languages have or will have support for type inference.

  • C# already has it
  • D Programming language already has it.
  • Java 7 will have it.
  • C++0x will have it.
  • C++ with TR1/boost can use a macro to simulate it today.
    1. #include <boost/type_traits.hpp>
    2.  
    3. #define AUTO(name, init) \
    4.     boost::remove_reference<typeof(init)>::type name = init
    5. #define AUTO_REF(name, init) \
    6.     typeof(init) name = init
    7. . . .
    8. AUTO(iter, items.begin());
    9.  
    10. // as oppossed to this from the example above
    11. std::map< std::string,
    12.                std::vector<std::string>
    13.         >::const_iterator iter = items.begin();
    14.  
    15. // use AUTO_REF when you need to make a reference
    16. AUTO_REF(ref, *item);

Conclusion

Major languages are getting type inference which is a pleasant syntax sugar to ease readability. When you know the few gotchas, it can save you time and screen space.

Resurrection

Viola! I have a blog again. However, don’t look too hard at the comments yet: I am still importing it from my last domain so the comments are pointing to the wrong posts. I should have it fixed by the end of the week.

I am also waiting for my webhost to give back my directory permissions so I can add all my cool widgets (especially the code colorer).

All My Worth

I will begin my career in a few weeks. Within in a ten-day span, I will have graduated, moved my family halfway across the country, and begun work in a prestigious company. That means going from working a campus job ten intermittent hours per week to forty full hours per week. I will be going from an extremely low-paying job to an extremely high paying job.

My new career has brought up two major concerns that have been on my mind for the last few months. First, how do I keep focused on the clock for forty hours a week; how do I maintain enthusiasm and fight boredom? Second, how do I handle all that new money that passes through my pockets?

I have solved the “problem” of increased income. After studying Dave Ramsey’s The Total Money Makeover and the Warrens’ All Your Worth—both extremely recommended books—I have decided to make a beautifully simple budget for our family. The budget contains a little “gazelle-like intensity” from TMMO, but is primarily from All Your Worth, mostly because it does the least amount of penny counting.

All Your Worth suggests a budget based on a 50-20-30 balance of Must-Haves, Savings, and Wants.

Must-Haves, weighing in at 50% are anything that you have to pay for, even if you have lost your job. House payments, basic food needs, insurance, and any legal or contractual obligations are Must-Haves.

Savings, pulling a hefty 20%, include retirement funds, debt re-payment, and traditional savings.

Wants is guilt-free money. Do as you please. Internet, vacations, clothes, and eating out are all Wants.

Because I am coming straight from college, with minimal student debt, and no oversized house or car payments, I can live with “gazelle-like intensity” and skew my Must-Haves and Wants into Savings. I am projecting a budget with the ratios 36:47:17. My student loans will be paid off in two months, and we will be well on our way to saving for a house and retirement.

The idea of percentages percolates down into the Wants category. We gave 50% to the whole family, 20% each to my wife and me, and 10% to our son. With our new income, 3.3% of the monthly income (20% of 17%) is huge to me. It is extremely simple. We have no complaints.

Wants

I encourage everyone to get a copy of All Your Worth by Elizabeth Warren et al. from your library. It presents an excellent method to get and keep your wealth in balance.

Data-structure slicing

Question: When obtaining a slice of a data-structure, what kind of object should be returned?

  1. // examples are from the D programming language
  2. // http://digitalmars.com/d/
  3. foreach (i; collection[7..31])
  4. {}
  5.  
  6. // what is the type of sub?
  7. auto sub = collections[7..31];

Solution 1: A heavy-weight collection that copies the data into its own structure. While this is useful, it occurs the overhead of copying the data into its own structure. This is O(n) in the best case (a list) and O(n log n) in the worst case (a tree).

Solution 2: A light-weight, iterable-only structure. This can be used iterate over a piece of the structure, it can be used to construct a new structure, and (most sweetly) it can be made with no heap activity. The semantics of the original container, however, are lost.

Solution 3: A middle-weight collection sharing the implementation and all or most features of the parent. This can be made into its own heavy weight container via sub.dup and is fast to construct—heap activity aside. In implementing this option I would only expose a immutable view of the collection.

The middle-weight and the light-weight solutions are both useful–the middle-weight solution being the most dynamic–but both can easily be combined into one container. The question of being most used determines which gets the easiest syntax.

Middle-weight prefered syntax:

  1. foreach (i; collection.each(7, 31)) {
  2.     …
  3. }
  4.  
  5. auto sub = collection[7..31];

Light-weight prefered syntax:

  1. foreach (i; collection[7..31]) {
  2.     …
  3. }
  4.  
  5. auto sub = collection.sub(7, 31);

Of the two of these, the latter is preferred because it is likely that people will otherwise forget to use the light-weight syntax. This will increase the overall speed of using the container, especially since, as far as I can tell, iteration will happen more often than creating an actual slice.

Of course, some magic can be performed to get the sub call to use the nice slice syntax.

  1. interface CollectionView {
  2.     alias int delegate(inout int) Func;
  3.     int opApply(Func);
  4. }
  5.  
  6. class Collection : CollectionView {
  7.     private int[] array = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
  8.  
  9.     int opApply(Func dg) {
  10.         foreach (i; array) {
  11.             int ret = dg(i);
  12.             if (ret) return ret;
  13.         }
  14.         return 0;
  15.     }
  16.  
  17.     private struct Sub {
  18.         private Collection actual;
  19.  
  20.         CollectionView opSlice(uint start, uint end) {
  21.             class InnerView : CollectionView {
  22.                 int[] subArray;
  23.                 this(uint s, uint e) {
  24.                     subArray = actual.array[start..end];
  25.                 }
  26.                 int opApply(Func dg) {
  27.                     foreach (i; subArray) {
  28.                         int ret = dg(i);
  29.                         if (ret) return ret;
  30.                     }
  31.                     return 0;
  32.                 }
  33.             };
  34.             return new InnerView(start, end);
  35.         }
  36.     }
  37.  
  38.     public Sub sub() {
  39.         return Sub(this);
  40.     }
  41. }
  42.  
  43. int main() {
  44.     Collection coll = new Collection;
  45.     CollectionView view = coll.sub[3..7]; // < — good –
  46.     auto x = coll.sub; // <———————– bad —
  47.     return 0;
  48. }

As you can see, there is a lot of extra code just to replace the parens with brackets and and a comma-space with double-dots. Is it worth it? Definetly not, it is just a novelty. The extra object makes the collection harder to maintain, so we will just stick to good-ol’ function call syntax.

There is also the side-effect, pointed out in the code, of letting that extra structure slip through. That extra structure can slip through when doing the light-weight slice as well. I’ll have to look into how to disable that.

Conclusion: We can support both light- and middle-weight solutions to getting a slice of the collection. By having the syntax prefer the light-weight construct, we can assure some speed gains by avoiding needless heap activity.

Iteratator Design

Q: Why would anyone use iterators when there is a foreach construct?

  1. // examples in the D programming language
  2. // http://digitalmars.com/d/
  3. auto iter = collection.iterator;
  4. while(iter.more) {
  5.     auto v = iter.next;
  6.     // …
  7. }
  1. foreach (v; collection) {
  2.     …
  3. }

Answers:
1. Enumerable is the only common interface.
Why? Mixing I/O streams and containers.
Solution:
Make ‘enumerable’ compatible with the foreach construct (ie. it implements opApply).

2. Iterators are used to transform the data structure (ala c++ stl).
Solution A:
Have mutable iteration either as default or as an extra call.

  1. foreach (inout i; collection.mutating)
  2.     i++;

Two reasons to have them as an extra protocol:

  1. The normal interface should be used for guarded operations
  2. Some collections (eg. sets) should not be able to be mutated in such a manner. Although this can be overcome by the default iterator itself, it supplies a consistent interface.

Solution B:
But what about the “walk of death?” One idea I’ve seen is a purging iterator.

  1. foreach (inout purge, inout i; collection.purging) {
  2.     if (i.stale)
  3.         purge = true;
  4.     else
  5.         i++;
  6. }

3. Iterators are used as markers.
Why? It Gives constant time access to elements in expensive constructs, such as
linked lists.

  1. list.collection.insert(i, b);

This kind of operation greatly speeds insertion in middle of lists; it is
otherwise a O(n) operation: the cost to move every element in an array-based
list, as well as the cost of finding position i in a linked list.

This should not be used with containers with no concept of order, such as
hash-based constructs.

Solution A:
Make an easy way to iterate from one iterator to another or to the end of a container

  1. foreach (i; Range[iter1..iterEnd])
  2. {}
  3.  
  4. foreach (i; Range[iter])
  5. {}
  6.  
  7. foreach (iter, i; IteratorRange[iter1..iterEnd])
  8. {}

Solution B:
Make it easy to use the iterator in connection with the collection.

  1. collection.insertBefore(iter);

Solution C:
Make it easy to re-get the value from an iterator.

  1. iter.value

Solution D:
In light of Solutions A-C, rename Iterator to Marker and have it non-enumerable (ie. it only has a value() method.) This, however, will break compatibility with I/O iterators; therefore, it will probably never happen.

4. The iterators are wrapped into transformations or chains.
Why? Self evident-but highly useful.

5. Parallel Iteration
Solution:
Why? It just cannot be done in a foreach construct.

6. Speed.
Is this really an issue? Some of the solutions above assume a speed aspect (2A,B and 3B, for example).

Conclusions:

Looking at these answers, I can see which ones are likely to be useful, at least used most often. Iterators are not cursors (markers). The concern of speed whilst inserting into the middle of a data structure should govern the use of the data structure. Of course, if the data structure does not support that crucial operation it has lost value; but in the case of the a structure like a deque is more appropriate.

Markers are out, but I have learned some useful things in this little exercize.

Information Thief

Philip Parker is writing books. Well not really. Phillip parker has a half dozen programmers and a large bank of computers, and the computers write the books. I don’t mind too much that the computers are writing the books. What I do mind is that the computers are stealing information to write the books. The sentences and graphics are lifted from people’s web pages and organized into a book, slapped into a cover and sold. That, my friends, is unethical conduct in the highest degree.

Women in Computing

With the differences in the way men and women think, we sure could use a lot more women in the computer sciences. Women have the distinct advantages of being able to multitask and having a higher connectivity between the brain hemispheres. Usability would go up. Way up. A different set of solutions to problems would emerge, and even different problems would be solved. Productivity would probably go down initially (Not due to gossip; I mean all the distracted men, but that’s a different issue).

He’s back

Frank Quattrone is back. Last time he reared his head, he made millions of dollars while everyone else went bankrupt. (I’m talking about the dot-com bubble.) Now Google has hired him as a consultant. I’m not sure if I want to chide Google for forgetting their “Don’t be evil” mantra, or for congratulating them on picking the best man for the job.

Copyright laws

Here’s a plan for copyright laws1: I can listen to any music that I own a license to; I can watch any of your movies on your DVD player. I can also give you my license as a gift, or lend you my license (which could automatically return to me after a specified amount of time).

This is becoming more feasible as more devices come with some sort of wireless communication. Your license database stays in sync with all your devices using a common protocol; there’s no problem there. Using the license as the decryption key would keep you from listening to another’s music.

The weakness is in license copying and selling. International laws will have to be made that disallow you from giving away a license without removing it from your database. You would also have to be legally authorized to sell a specific license. Once this rather large obstacle is overcome, I think the scheme would work very well.


Footnotes

  1. I got the beginning of this idea from Dr. Chuck Knutson []

Closing my browser

I used to have a big problem with the internet. Not in the sense that I don’t like it, rather, that I like it too much. I would spend hours upon hours learning about things—mostly programming—to the extent that I never got anything done. To a much lesser degree, I still have this problem. I can think of two causes: my attention span is short (I get tired of a project after a few weeks so I move on to a new shiny object) and I love to learn new things. To combat the problem I’m doing two things: getting organized and closing my browser.

Getting Organized. I am implementing David Allen’s Getting Things Done. Once I get myself setup, I’ll know exactly what I need to do. Sure, there is still that element of procrastination, but I’m just going to have to recognize that I am putting things off and stick to my implementation.

Closing my browser. My browser stays open constantly. I close it every week or two to free up memory, but it immediately reopens. No more: it is a distraction keeping me from my job and school. I will have on “online” context in my Getting Things Done implementation, and only open my browser when I need to.

By closing my browser and getting organized, I will put away the internet and do the things I need to do.

Update: I forgot disclose my secret to keeping the browser closed while needing to have it close at hand—I am a computer scientist, after all. I have Google Desktop, so I hit <Ctrl> twice and it brings up a search box, and I have several keyboard shortcuts that bring up oft-used places

Also, thanks to Matt Connew for pointing out that I got my Allens mixed up. It was David, not Paul that wrote GTD.