Feeds:
Posts
Comments

Posts Tagged ‘Ruby’

As you build a relationship with various languages, it seems like each one has its own little points of attractiveness that you miss having around when forced into another context.  For instance, Ruby (and in many ways Perl) has a bunch of really succinct, elegant operators that contribute to the satisfaction of using the language.  Honestly, Ruby has this delight thing going for it that is difficult to pinpoint or quantify, but absolutely has a presence sorely pined after when working with other languages.

Since operators tend to be easy to overload in many languages, I try to bring the more helpful ones along from one language to another.  F# has the nice property of not only supporting operator overloading, but operator definition as well.  This is great for when you want to grab an operator from another language and add it to the mix.  Since F# doesn’t implement many of my favorite Ruby operators, we’ll put together some implementations here.

A Quick Word

F# has a very straight forward syntax for declaring both infix (binary) and prefix (unary) operators, which can be stand alone definitions or static members of types.  Operator definitions are distinguished by parenthesis around the operator identifier.  Here we have a stand alone definition of a robust division operator that returns a tuple of the result and the remainder.

    let (/%) left right =
        left / right, left % right

This operator is assumed to be infix because it has two parameters.  If we had just a single parameter, it would be considered prefix.  We could place this same definition on a type as a static member as well.

    type System.Int32 with
        static member (/%) left right =
            left / right, left % right 

EDIT:  It turns out that operator type extensions are currently unsupported as of the Sept 2008 CTP.

In either case the operator can then be used like any other:

    > 50 /% 15;;
    val it : int * int = (3, 5)

F# lets you create operators out of any* sequence of the following characters: !$%&*+-./<=>?@^|~: so knock yourself out.

*There are some special rules, for example any operator that starts with anything in the set of { !, ~, ? } must be unary.

Sequences, Lists, Arrays, etc

Many of the operators defined below operate on collections.  I’m choosing to make these List operators since I work mostly with Lists and the results of the operation print nicely for blog purposes (as opposed to sequences that only print 4 deep).  In practice, I would implement these on Sequence so that any collection type is supported, but these examples are just for demonstration.  Further, any of these can easily be added to the any of the standard collection types directly (list, array, etc) by using the type extension syntax above and replacing calls to the List module with calls to the appropriate function on Seq, Array, or whatever.

Also, the F# compiler is protective of well known operators and will emit warnings if you change (extend) their definitions.  Primarily you’ll see things like: warning FS0086: The name ‘(&)’ should not be used as a member name. If defining a static member for use from other .NET languages then use the name ‘op_Amp’ instead. To avoid these clashes I’ll throw a dot at the beginning of my new operators.

The Ruby Collection Operators

The operators defined for Ruby collections are intuitive, concise, and handy for many common activities.  Need to identify who hasn’t responded to an invite or which nodes haven’t completed an operation?  Use the difference operator.  Need to identify elements common to several collections?  Use the intersect.  None of these are ground shaking new concepts, but the terse, clear representation is great for readability and utility.

The difference and intersect operators make use of a temporary HashSet to allow constant time lookup of values in the right hand collection.  This yields O(m+n) time complexity instead of the O(m*n) you’d see in a brute force linear lookup.

Concatenation (+)

The concatenation operator is very straight forward and produces a composition of both collections allowing duplicates.

    let (.+) left right =
        List.append left right

    > [1;2;3;4] .+ [3;4;5;6];;
    val it : int list = [1; 2; 3; 4; 3; 4; 5; 6]

EDIT:  There is also a (@) operator that achieves this behavior.

Difference (-)

As expected, the difference operator produces a collection of items contained in the left argument that do not appear in the right argument.  Again, this is nice for singling out people who haven’t responded to an invite or operations that haven’t completed.  The HashSet is chosen because of its constant time lookup.  We avoid in this case the standard Set since it is based on a binary tree and thus yields logarithmic time.

    let (.-) left right =
        let mem = HashSet<_>(right)
        left |> List.filter (fun n -> not (mem.Contains n))

    > [1;2;3;4] .- [3;4;5;6];;
    val it : int list = [1; 2]

Intersect (&)

The intersect operator makes use of the HashSet similar to the difference operator and produces a collection of items common to both the left and right arguments.  Useful for “you might also like this”-style suggestion algorithms and trend analysis.

    let (.&) left right =
        let mem = HashSet<_>(right)
        left |> Seq.filter (fun n -> mem.Contains n)

    > [1;2;3;4] .& [3;4;5;6];;
    val it : int list = [3; 4]

Union (|)

The union operator produces a collection of items similar to concatenation, but with uniqueness of constituent items.

    let (.|) left right =
        Seq.append left right |> Seq.distinct |> List.of_seq

    > [1;2;3;4] .| [3;4;5;6];;
    val it : int list = [1; 2; 3; 4; 5; 6]

Repetition (*)

In Ruby and Perl the repetition operator applies not only to collections, but strings as well.  For our case here we implement the collection version.  One useful case is in generating input sequences for automated tests.

    let (.*) n list =
        let res = List.empty
        [1 .. n] |> List.fold_left (fun acc i -> List.append acc list) res

    > 3 .* [1;2;3];;
    val it : int list = [1; 2; 3; 1; 2; 3; 1; 2; 3]    

Regular Expression Binding Operator (=~) (Perl/Ruby)

For simplifying the representation of regular expression matching, Perl and Ruby make use of a Binding operator that tests the string on the left hand side against a pattern on the right.  If the string is a match for the pattern, the expression returns true.

    let (=~) s pattern =
        Regex.IsMatch(s,pattern)

    > "Mississippi" =~ "M(i(ss|pp)*)+";;
    val it : bool = true  

A Functional Binding Operator (=~~)

One Perl/Ruby facility not included in the .net Regex implementation is the population of $~ variables after a match operation.  If the matched pattern had groups that need to be referred to, the $1, $2, etc variables can be used after the =~ match to access the data.  In our non-Perl/Ruby case, we don’t get the magic $~ variables so the =~ isn’t as useful since we don’t have a resultant Match object with our data.  To deal with this we’ll make a new operator =~~.

    let (=~~) s pattern =
        let m = Regex.Match(s,pattern)
        if m.Success then Some m
        else None

    > "Mississippi" =~~ "M(i(ss|pp)*)+";;
    val it : Match option = Some Mississippi {Captures = seq [...];
                                Groups = seq [...; i; pp];
                                Index = 0;
                                Length = 11;
                                Success = true;
                                Value = "Mississippi";}  

The idea here is to wrap the Match object in an option so we can use the operator in a pattern matching context rather then a simple conditional.

    let testMatch s =
        match s =~~ "^M(i(ss|pp)*)+" with
        | Some m -> printf "%s\n" (m.ToString())
        | None -> printf "No Match"    

This avoids the need to rely on implicit $~ variables and feels much more like a functional idiom.

The Null Coalescing Operator (??) (C#)

For dealing with many common situations where a potentially null value needs to be replaced with a dependable backup, C# uses the null coalescing operator.  EDIT: the F# compiler ignores leading . and $ characters not only for determining the precedence of operators, but also for identifying if they are prefix or infix.  For clarity, I’ll avoid use of ?? altogether in favor of $$.

    let ($$) left right =
        match left with
        | null -> right
        | _ -> left   

This tests the left hand side for null and if a non-null target was given that value is used.  Otherwise, the right hand argument is used instead.  This is very similar to the Ruby idiom of testing a value with ||= to initialize with a known value in the case of nil.

A Functional Null Coalescing Operator

The standard null coalesce is useful when dealing with .net objects from libraries external to F#.  When working with F# types however, null doesn’t come up anywhere near as often as option types since it is preferable to allow the type system to handle null (None) values.   Again, to feel more at home in a functional context, we’ll create a new operator $$$.

    let ($$$) left right =
        match left with
        | Some _ -> left
        | _ -> right  

This then allows you to test a value that may potentially be None and substitute a known value should the case arise.

    let userPrefs = getUserSpecifiedPrefs() $$$ getDefaultPrefs()   

Nice.  So yeah, these are my favorites.  I’d love to hear feedback on improvements that can be made or interesting operators from your favorite language that I’ve ommited.

Enjoy.

Read Full Post »

ForEachLine

To aid in the development of my main application work, I often find myself writing lots of little C# apps to analyze diagnostic bread crumbs left behind by in-development code. Yeah, I said C# and not Ruby or Python or whatever. This is mainly because of the Linq facilities and what they offer for analyzing and projecting lists of data. Because these are C# apps and the bread crumbs typically take the form of files, I would constantly write code to this effect:

    var file = @"C:\logs\diag.log";
    if (File.Exists(file))
    {
      using (StreamReader reader = File.OpenText(file))
      {
        while (reader.Peek() > -1)
        {
          doSomething(reader.ReadLine());
        }
      }
    }

This really grows tiresome when all you really want to do is act on the content of the file. Ruby has a nice shortcut for this using the foreach class method on File (or IO).

    File.foreach("C:\\logs\\diag.log") { |line| doSomething(line) }

As it turns out, this sort of behavior is easy to add to C#:

    public static void ForEachLine(string file, Action action )
    {
      if (File.Exists(file))
      {
        using (StreamReader reader = File.OpenText(file))
        {
          while (reader.Peek() > -1)
          {
            action(reader.ReadLine());
          }
        }
      }
    }

which opens the door to calls like

    ForEachLine( @"C:\logs\diag.log", line => doSomething(line) );

Thats nice and everything, but I wanted to make an extension method out of this and put it on the File class (which would be the intuitive thing to do). The problem is that extension methods in C# as of 3.0 are only allowed to be instance methods. So I’m denied the File.ForEachLine() invocation. I don’t like the prospect of having to construct a FileInfo instance to call this since in order to do that I would have to write all the boilerplate code this call is intending to avoid in the first place.

There is, however, a lightweight object that is already being created in every one of these cases that actually winds up making sense as the target of this call.

the string.

If we stick our call from before in an extension class and change the first param to be the call receiver, we get this:

    public static class Extensions
    {
      public static void ForEachLine(this string file, Action action )
      {
        if (File.Exists(file))
        {
          using (StreamReader reader = File.OpenText(file))
          {
            while (reader.Peek() > -1)
            {
              action(reader.ReadLine());
            }
          }
        }
      }
    }

This might not look too different, but allows us the following syntax:

    var file = @"C:\logs\diag.log";
    file.ForEachLine( line => doSomething(line) );

or even

    @"C:\logs\diag.log".ForEachLine( line => doSomething(line) );

This is succinct, intuitive, and makes those little C# apps feel a little more Rubyish. That might even be a good thing.

Read Full Post »