Tuesday, August 6, 2019

wacky joins of flat json data and dumping to excel

Just keeping this around for my own reference, it might be too esoteric for anyone else.

I use a filesystem based database for my Porchfest work: one folder roughly equals a table, and each json file corresponds to a row, and the filename acts as the key. (and each file is a simple map, where the keys correspond to column names and the values are the column values)

So I have one folder for porches, one for bands, and one for "gigs", which is one to many join table for porches going to bands, along with what time.

My porchfest runner asked if I could dump out the porch / band / what time info to a flatfile (or rather, to a dirty "tab delimited file pretending to be a spreadsheet" that excel lets me get away with)

The format she wanted was a little idiosyncratic: one row per porch, and then each band and gig kind of flattened on the same row-- like if I was doing it on my own I might have done it


so that columns were "clean" and nothing was duplicated (but different rows mean different things) or possibly
where columns are still clean, but information is duplicated...

She asked for something more like

I personally find it less intuitive, since different porches have different numbers of bands, and columns have to be "sort of" duplicated etc, but I kind of get it... I'm thinking too much like a computer and not an excel user :-D

Anyway, here's the code that does it.  I read in all the metadata as GET parameters (though the commented out variables give a feel for the values I'm passing in ), I build a top row as header values, and then if I'm passed a file name I use that dirty old php to excel hack otherwise I just dump the info to the browser <pre> tags

<?php
/*
    prints out select contents of three tables assuming a one to many relationship,

   print out will be 
      leftdata rightdata1 joindata1 rightdata2 joindata2
        in practice:
        porch1 band1 gig1 band2 gig2
        porch2 band3 gig3

*/
        /*
        $lefttable = "porch";
        $righttable = "band";
        $jointable = "gig";
        $leftjoinfield = "porchid";
        $rightjoinfield = "bandid";
        $leftfields = "address|spacetype|zone|capacity|raindateable|notes|maincontact|email";
        $rightfields = "bandname|email|howloud|needelectricity|bandsize|candoraindate|preferredstarttime|performancelength";
        $joinfields = "starthour|startminute";
        #$leftfields = "address";
        #$rightfields = "bandname";
        #$joinfields = "starthour|startminute";
        */
        $lefttable = $_GET["lefttable"];
        $righttable = $_GET["righttable"]; 
        $jointable = $_GET["jointable"];
        $leftjoinfield = $_GET["leftjoinfield"];
        $rightjoinfield = $_GET["rightjoinfield"];
        $leftfields = $_GET["leftfields"];
        $rightfields = $_GET["rightfields"];
        $joinfields = $_GET["joinfields"];
        
        $filename = isset($_GET["filename"]) ? $_GET["filename"] : "";    
        
        $leftguts = getGuts($lefttable);
        $rightguts = getGuts($righttable);
        $joinguts = getGuts($jointable);
        
        $buf = "";
    
        $maxjoins = 0;
        
        
        foreach($leftguts as $leftkey=>$left) {
            getValuesForKeysInString($left,$leftfields)  ;
            $joinmatchkeys = array();
            foreach($joinguts as $joinkey => $join){ //go over all keys in join table
                if($join[$leftjoinfield] == $leftkey) { //if this is a join
                  array_push($joinmatchkeys,$joinkey);
                }
            }
            if(count($joinmatchkeys) > $maxjoins) {
                   $maxjoins = count($joinmatchkeys);
            }
            foreach($joinmatchkeys as $joinmatchkey) {
                $rightkey = $joinguts[$joinmatchkey][$rightjoinfield];
                #$buf .= "$rightkey\t";
                getValuesForKeysInString($rightguts[$rightkey],$rightfields)  ;
                getValuesForKeysInString($joinguts[$joinmatchkey],$joinfields)  ;
            }
            
            $buf .= "\n";
                
        }                

        
        $headerbuf = "";
        foreach(explode("|",$leftfields) as $leftfield) {
            $headerbuf .=  "$leftfield\t";   
        }
        for($i = 0; $i < $maxjoins; $i++){
            foreach(explode("|",$rightfields) as $rightfield) {
                $headerbuf .=   "$rightfield\t";    
            }
            foreach(explode("|",$joinfields) as $joinfield) {
                $headerbuf .=   "$joinfield\t";    
            }
        }
        $headerbuf .= "\n";
        
        $buf = $headerbuf.$buf;
        
        if(! $filename) {
           print "<pre>$buf</pre>";        
        } else {
            header("Content-Disposition: attachment; filename=\"$filename.xls\""); 
            header("Content-Type: application/vnd.ms-excel");
            print "$buf";
        }
?><?
        function getValuesForKeysInString($guts,$keystring){
            global $buf;
            foreach(explode("|",$keystring) as $key) {
                $val = $guts[$key];
                $val = str_replace("\t"," ",$val);
                $val = str_replace("\n"," ",$val);
                $val = str_replace("\r"," ",$val);
                $buf .= $val . "\t";
            }
            
        }

        function getGuts($tablename) {
            global $dbroot;
            $guts = array();
            $path = $dbroot."/db/".$tablename."/";
            $files = scandir($path);
            foreach($files as $file){
                if(substr($file, 0, 1) != "."){ //ignore hidden files
                    $rowguts = json_decode(file_get_contents("$path/$file"),true);
                    $guts[$file] = $rowguts;
                }
            }
            return $guts;
        }
?>

No comments:

Post a Comment