Programming Blog

This is my diary on keeping track of my progress to implement a programslicer in Haskell. Mails go to roman at bromeco.de.


Sat Oct 17

Had some good progress on State. I’m currently struggling with implementing the bind function =<<, but found this wonderful article on fpcomplete I’m currently studying.

Update: It verks. The best take away of the article for me was this:

When you execute a lambda, you simply replace it with its body and replace the formal parameter with the actual argument.

Mon Oct 12

Stuck at implementing fmap for State.hs. My mental model is centred around the fact to apply the given function (a -> b) on a concrete value, but any attempt to isolate that value a fails. Reading this article makes me wonder if function composition is making more sense here …

Update: Got a bit further with a bit of help from #bfpg. I’ve removed the State wrapper and tried to implement:

f :: (a -> b) -> (s -> (a, s)) -> (s -> (b, s))

My first mistake was to someone call the function in the second parameter in order to apply the first parameter (function) to the value in order to get b. As the state monad article already stated, we function composition is the key here. Think of the above type annotation as:

f :: (s -> b) -> (s -> a) -> (a -> b)
-- (.) :: (b -> c) -> (a -> b) -> a -> c

But that’s not all, just using function composition like:

f fx gx = fx . gx

leaves us with a compiler error: Couldn't match type 'b' with '(b, s)'. Using a type hole in front of the fx tells us we need a function:

foo :: (a -> b) -> (a, s) -> (b, s)

which makes the implementation way more straight forward. Apparently thats exactly what first from Control.Arrow provides.

Update: I meant first from Data.Bifunctor not Control.Arrow. Thanks to Fraser Tweedale, I got the “unwrapped” function right, but struggled putting it all back together with in:

instance Functor (State s) where
  (<$>) ::
      (a -> b)
      -> State s a
      -> State s b

Turned out that I had just a “spelling mistake” referring to the wrong bound variable in my function.

Tue Oct 06

Progressed and now stuck at Bind.hs. Using type holes I got a solution:

(=<<) f g = flip f <*> g

Unfortunately it ends up in an infinite recursion :(

Update: Found it. Got a hint from Tony Morris on #bfpg. Essentially what it came down to is to take the t out of the second function and proceed from there. How do I take out a parameter from a given function. I had to go back to basics. This stack overflow question helped me to understand, especially this part:

(Remember that f a b = ... is basically a shortcut for f a = \b ->
... which is a shortcut for f = \a -> \b -> ....)

My mental model always thought of functions as something like a callable, e.g.

def foobar(callable):
    callable(a, b, c)

but there is a reason why people emphasize that functions always take one parameter.

Sat Sep 26

Working on the Applicative function replicateA brought back memories. I tried to use replicate out of the Prelude but couldn’t get it to work. I doubted the ability to use the function at all, since it used general quantified types a and not functors f a. But turns out it can be used. And of course. If it would have not been the case it would have almost rendered the functional programming language unusable if you need to implement almost the same function for different types.

Tue Sep 21

Got a bit further with the help of folks on #bfpg. I’ve a const function which applied to two functors returns the second: fconst _ x = x. Applied this to the first functor and mapped over the second functor: (*>) f g = fconst f <$> g I get the first result. The tricky thing is that it doesn’t map over the entire first functor which it should be: (*>) :: Apply f => f a -> f b -> f b

Update: Yay!!! Found the solution. The type of my applied fconst function was: t -> f b -> f b so I thought, I need something which uses the given f a (btw. in ghci type: ghci> :type \f g -> fconst f <$> b to check the type). Anyway.. scrolling up I recognized the type signatures of the lift functions. They take a function and create functors out of the parameters. So having a function fconst I thought why not:

(*>) f g = lift2 fconst f g
    where fconst _ x = x

and boom it worked! A nicer way however is to turn it into:

(*>) f g = lift2 (flip const) f g

and I can save the helper function.

Mon Sep 21

Made some great progress on the course. Got stuck now and then but help from Fraser Tweedale to get back on the moving tracks. Now I’m stuck again, although I feel the solution to be really simple. I’m implementing Apply for a sequence, discaring the value of the first argument, basically *>. So the function takes two Functors (f a and an f b). It returns the values of the second functor applied to the first, discarding the values of the first. Just doing (*>) _ g = id g produces the first mapping - Hurry. Now I need to map over the given (first) Functor f to produce a sequence of values from the functor b. Hmm…

Mon Sep 14

Started with the NICTA course. I wonder if you know before you learn something new how much theory you should learn before you apply it?

Sat Sep 12

One other item to find out is a functor instance of each AST type. Wouldn’t that mean I should be able to traverse the tree? I never tried the NICTA. Maybe I should.

Fri Sep 11

Turns out the idea doesn’t work as expected. Grouping will obviously only work on the columns of the first leafs of the AST. I would need to define a recursive function to recurse of the AST branch. Doing that I would need pattern matching, which means I could just group manually. Update: My expectations were slightly off about the basic blocks, however one additional problem with this approach is that leafs of the AST are encapsulated in their regarding data types. I’ll won’t get around using pattern matching. Doing that tho, returning a basic list of statements won’t work either. Since for a conditional for example, I don’t only need the statements inside the condition, but also the expressions of the actual if statement. Expressions are different type to statements so that won’t work. Which means creating new datatypes of my basic blocks. Doing that well… I could just use my types in hoopl. That means, I’ll need to go back to the drawing board on how I can create my datatype and make sure there is an edge between the basic blocks of the condition… One problem was, that during instantiation of my conditional type I have not had all the necessary information at hand. I need to re-visit this problem.

Mon Sep 7

Idea: in order to cut up the source code into block I try to get a list of basic block by looking at the indentation. That will still need some further mangling but it sounds promising.

Sat Sep 5

wow due to sickness and new project haven’t done anything on my project. I’ve added condition and now facing the problem of splitting up sequences of statements correctly up into blocks. I always wonder if I could do it simply based on the indentation level. I’ll start of with a quick Monad homework to pattern match and correctly bind sequences of values in order to end up with the right set of labels for each instruction. The condition is tricky, since I need a label at the start and a new one at the next normal instruction and then I need an edge between the two.

Sat 25 July

progress: I’m now constructing one CFG per module, not per function. That allowed me to remove quite a bit of stupid code. Using ‘stack’ more instead of cabal. something is odd with my dependencies: the tests can’t find HUnit sometimes. Found “tasty” as a framework for writing test. Will be trying that. The tests are currently becoming a bigger problem for verification of the program. After that we go for new instructions.

Fri Jul 17

Ok I’ve done it again: coding without brain. Zombie coding. Why am I not thinking about how to get the darn graph into a graphviz compatible graph. Here is an idea: I convert the graph to an inductive graph. That’s something Graphviz can already deal with. Still need to figure out how to number the darn nodes, but well… maybe I should use my brain. Brr.

Mon Jul 13

Again impasse… Just mucking through the graph doesn’t work. Extracting the nodes and then fiddling a graph together is shit. I need to better understand what I’m doing. My example perhaps is too simple. The control flow just flows through in the example. No jumps, no nothing. Debugging the example also revealed no entry and no exit. That is odd… Perhaps I should write a better test for this or look into coming up with a better example perhaps with a condition.

Fri Jul 10

Worked more on the visualisation. I first had a plan to create new edges by traversing the graph using the state monad. Then I realised, I could already use the created nodes as such instead of fiddling with the integers. Create a list of nodes and then number them. Today I went back to the state monad implementation which assigns numbers while traversing the graph. It doesn’t work as planned. Then creating the edges during traversing is a night mare. Esp. with BMiddle where I have no clue about which node could form an edge, since I have no edge throw a spanner in the works. Perhaps creating the flat list first with simply references is the better approach. I revisit that tomorrow again.

Tue Jul 7

Worked on the visualisation again. Hit a dead end with not being able to decide how I correctly deconstruct the blocks and what to do with the Normal constructors. There are no labels … should I re-use last integers again? Perhaps … I need to think about it.

Tue Jun 30

I’ve made a bigger jump yay!! I’m now almost 100% sure that I’ve made a mistake with the graph construction in the monadic code. It converted every source line into a separate graph with entry, middle and exit. I’ve fixed that. Now I can see that my support of a limited amount of statements obviously results in very boring graphs since no conditions or jumps are supported. What’s more - I’ve got the pretty printing right. That based on my smaller change before storing the actual AST nodes. With that I can pretty print the source code (unfortunately still indentation and line numbers missing). Maybe I could take a stab at visualising the graph now!!! w00000w /

Fri jun 26

Made small progress. I stabbed around in the generated graph with the fixtures I created. it seems that there is really nothing much to it if you don’t have branches or conditions. Because currently I only have entry, flow through and exit. Based on that assumption, I try again to visualise the graph with a state monad in which I keep track of the numbers I used peer label/block. The is more tho, since the visualised graph wouldn’t show any source code + pretty printing the code is another story.

Fri Jun 19

After some euphoria and progress with Graphviz the wall again. Transforming the hoopl graph into a graph I could use with Graphviz raised a few questions: one of them: how to I get the edges out? The graph feels really unwieldy… maybe I’m missing something. The fact that I’m tired does not really help.

Mon Jun 15

I’ve made some progress yay. I’ve re-factored my code and use a state transformer monad instead of what hoopl provides. That cleaned up some code and made it way more understandable. That allowed me to continue with the visualisation code. Unfortunately it produces no graph, only an empty white PNG :-{ If I wouldn’t be too sleepy, I’d be checking out what the heck is wrong, but not today.

Fri Jun 12

Arg.. just hitting a block again. My initial idea of painting the graph turned out to be the wall. I thought I could just take the labels indicating each block and … well create the graph. But how do I know the association? So I haven’t thought that far. All I get so far is a flat list of labels when I print the graph. That is not sufficient. One benefit though, I’ve got a better understanding about how the graph composes itself, so it’s easier to attach a meaning to what is describe in the paper. yay. Ok, one step further: dodged the problem when thinking about it. The information comes obviously from the types. A type of Block n O O means it’s open on entry and on exit: control flow flows through. That means it would be a pretty linear reference from one block to the next. So I could paint the graph. Here comes the but: the map I use for creating labels stores a mapping between source code and label. So much for looking up the source code given the label :( So.. next: I need another map storing the opposite.

Thu Jun 11

Sick the last couple of days. Made just a tiny fraction of progress, almost not worth to mention. I’m currently stuck mapping over my graph. I’ve found some sample code to study and I’m also studying example code from llvm-analsys. Great module.

Fri June 5

I’ve made some progress finally. I felt a bit demotivated after last sessions didn’t go anywhere. The labels are simply used to identify blocks. The annotated code fragments are kept in the map. That’s great. Now I need a better way of labeling my blocks. I’ve found another hoopl implementation in a package called: llvm-analysis. What’s more it also provides code to visualise the graph. I’ve played around with the Data.Graphviz package yesterday. I’m pushing forward with this. If I can visualise the graph I’m constructing, I’ll have the first module which can show a CFG for python programmes. Furthermore I can actually see what my code is doing. So I can feel my motivation creeping back in. Furthermore I’ve figured out that I need a control dependency graph for my programme as a starting point for slicing. How to obtain that? Well it looks that I can construct a CFG from a CDG. One other idea: it seems that I might be able to construct a CFG by just looking at the indentation (apart from loops and stuff). I should check this out.

Fri May 29

Hit a wall when I looked at another implementation which used hoopl here: http://nochair.net/posts/2013/02-20-a-hoopl-experience.html It might sound worse than it actually is tho. I’m close, but my implementation is basically wrong. What gets me though, that I’m still not clear where I go with these labels from hoopl. Sure the control flow graph is being created with the use of labels, but am can I use the CFG and see not the labels, but the source code identifiers? Almost think about going back to the coursera course, to learn more haskell, before continuing. There is a lot of stuff I theoretically understand, but practice is wonky. I can’t apply it correctly.

Fri May 22

Thought more over it over night and came to the conclusion that my approach of converting a language last to an IR will not produce a control flow graph I can use for slicing and return a path to be visualised in my editor unless I operate on exactly the same AST. I think it also finally clicked how to use function composition (.). The function you compose takes a function instead of a value. Then you can compose it.

Thur May 21

I’ve put more work into studying hoopl for my programslice. I’m lost. Not sure if I should go through an IR or use hoopl directly with the IR. It seems it’s all due my ignorance. I put using IR aside and see how using hoopl with the python ast will work out.

Tue May 12

Want to try the converter out and can’t get the damn thing to run. I thought I know how to get a computation out of a monad, but it doesn’t seem to work if I’m in another monad (IO that is). Will have to read more on it.

Sun May 10

Worked more on the AST to IR converter. Didn’t get much done. Have to be careful to keep the “smallest” item for the graph to be the function. That means I can ignore a few statements and expressions.

Sat May 9

Working on programslice. Halfway through the converter to an intermediate representation. Good thing: I understand to 90% what’s going on from the hoopl example. The monadic code is a bit of a grey area. Bad thing: the language annotation (column, line numbers) which is needed for a slice result is gone. Currently not anticipated. So I’ll have to find a solution to keep that somewhere. Not sure if with a growing number of languages the IR is not turning out to be a PITA. let’s see…

Tue May 5

I’ve got the arbitrary instances fixed so that they compile. The property test I’ve implemented also runs in the module I’ve moved it in. Great!!! The only problem is, that I expected the test to fail since it’s utter garbage I have it testing there :{ Something I need to figure out what’s wrong tomorrow. Things to note: Implement the arbitrary instances for each atomic part of the program (e.g. I needed arbitrary instances for my Edges first before I write and test the graph).

Tue May 5

Having another go at the Arbitrary implementation for Graph and Edges. I’m going nowhere. With the Edge which now is an actual data type, I can’t understand how I can construct an “arbitrary” Edge for the vertices and the weight. So how do I provide two different types to create my Edge? Graph is another beast, but slightly different. First blocker was the fact that it is just a type alias. No, I’m not implementing another Hashmap. With language pragmas I got further, but now I don’t know how to code around my generic data types I’m using for the instance header. Maybe I should just stick with concrete types for the moment?

Sat May 2

I’m in the process of changing the graph to be a weighted graph. After the first hurdle it was actually surprisingly easy and straight forward to deal with the fact that I’m now storing complete edges with a weight instead of just a list (neighbours). So that’s were data types are becoming helpful: encapsulation. The tests needed more work, but nothing bad. Found that cabal has a way to execute a test suite. Nice.

Wer April 29

Interestingly enough the Data.Graph topological sort can sort even DAGs which is by definition not possible. It seems that is due to the fact that it does the sorting on a forest which is by definition undirected. My recursive implementation resulted in an infinite recursion when confronted with a cyclic directed graph. Changing the implementation to just skip visited notes results in an ordering, but it shouldn’t be. That corresponds to skipping an edge which is actually not what I want here. Using Maybe doesn’t seem to make the code more readable. Bleh :-( Moving on to minimum spanning trees I got stuck rewiring my implementation to accept a weight for my edge.

Tue April 28

Caught a cold over the weekend so couldn’t hack. Interesting findings I’ve made: I’ve compared my implementation with the one in Data.Graph. Even tho mine is far inferior, I now understand Data.Graph wayyy better. I also realise where the data types come into play. For one it’s good documentation. The dfs implementation in Data.Graph returns a forest of vertices which is an undirected graph of trees. working with this type as a result is far more elegant than lists. Furthermore it seems straight forward with the immutable data types.

Thu April 23

Darn week almost over again. No coursera today but hacking on my uvmeter parsec parser. Got it parsing with some help from an online article. Need to hook it up tomorrow and let it go for a spin. Only missing thing: not using a parser combinator. Heck no idea wtf it actually is.

Wed April 22

Started with parsec to replace the conduit library in my uvmeter patch. Nothing to report there. On the coursera front I think I figured why I had trouble with the cycles. The course and slides kind of skip through the ordering and I thought topological sort is all. But turns out you get to it via creating first a postorder.

Tue April 21

Implemented strongly connected components. Took a stab by re-using my code I implemented for topological sort order, which is in the second stage almost the same. Exception is to create a map for the look up. It worked with my graph which had no cycles, but failed with sunils. I’m a bit puzzled, since even the example in the lecture seem to have cycles. Looks like I’ll have to figure out first what it means to have or not have cycles. Seemed to be straight forward to me, but it seems now it isn’t.

Mon April 20 (2)

I’ve got it! I’ve got it!!!! Got him by the balls. Yess. My mistake was to always concatenate the result list of the first search branch to the rest of the search (the search of the other branches). That concatenated the end result on to the result I had previously resulting in an infinite loop. Instead I should have just used the darn result of the first recursive call in the call of the rest of the list. Dang!

Mon April 20

I think I have to start over again. I’m carrying 4 lists around. Haven’t had much time to debug what is going wrong. This is driving me crazy.

sund April 19

Figured today that my implemented “solution” was all wrong. Used a bigger directed graph and nothing was ok. I’ve tinkered more and realized that i needed a stack in order to tie up the remaining dfs nodes I run over with. That’s what the learning material was talking about all the time. Brr. His did I miss this obvious thing? Now it’s working for one branch. I hope I can find the missing piece to make it work over the while graph.

sat April 18

Today i continued with my topological sorting. I can’t believe that i spent the last the days brainlessly making the same mistakes iterating over one recursive function which should be two. I broke the problem down to parts: get the last result of a dfs, the the whole branch, now applied consecutively. Last step is to hook it up so that I run it over all vertices. I’ve made the same mistake when implementing connected components: too much at the same time. The tricky thing is thinking recursively while also thinking about how to update your immutable data structures. In imperative languages you can mutate wherever you want.