scRNA-seq_analysis

2024-10-23 08:29:24 -07:00 · 2019-07-08 12:22:01 +01:00 · 2019-07-08 12:22:01 +01:00 · 82cc2d191e
commit 82cc2d191e
188 changed files with 146184 additions and 0 deletions
--- a/pipelines/14_fdg_animation_write_input/force_abstract_graph_2Danimation/iterative_fa2/README.md
+++ b/pipelines/14_fdg_animation_write_input/force_abstract_graph_2Danimation/iterative_fa2/README.md
@ -0,0 +1,154 @@
+## ForceAtlas2 for Python
+
+A port of Gephi's ForceAtlas2 layout algorithm to Python 2 and Python 3 (with wrappers for NetworkX and igraph). This is the fastest Python implementation available, with most of the features complete. It also supports the Barnes-Hut approximation for maximum speedup.
+
+ForceAtlas2 is a very fast layout algorithm for force-directed graphs. It is used to spatialize a **weighted undirected** graph in 2D (edge weight defines the strength of the connection). The implementation is based on this [paper](http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0098679) and the corresponding [gephi-java-code](https://github.com/gephi/gephi/blob/master/modules/LayoutPlugin/src/main/java/org/gephi/layout/plugin/forceAtlas2/ForceAtlas2.java). It's much faster than the Fruchterman-Reingold algorithm (spring layout) in NetworkX and scales well to a large number of nodes (>10000).
+
+<p align="center" text-align="center">
+    <b>Spatialize a random Geometric Graph</b>
+</p>
+<p align="center">
+  <img width="460" height="300" src="https://raw.githubusercontent.com/bhargavchippada/forceatlas2/master/examples/geometric_graph.png" alt="Geometric Graph">
+</p>
+
+## Installation
+
+Install from pip:
+
+    pip install fa2
+
+To build and install from source, run:
+
+    python setup.py install
+
+**Cython is highly recommended if you are building from source, as it speeds up the layout by a factor of 10-100 depending on the graph.**
+
+### Dependencies
+
+-   numpy (adjacency matrix as complete matrix)
+-   scipy (adjacency matrix as sparse matrix)
+-   tqdm (progressbar)
+-   Cython (10-100x speedup)
+-   networkx (To use the NetworkX wrapper function, you obviously need NetworkX)
+-   python-igraph (To use the igraph wrapper)
+
+<p align="center" text-align="center">
+    <b>Spatialize a 2D Grid</b>
+</p>
+<p align="center">
+  <img width="460" height="300" src="https://raw.githubusercontent.com/bhargavchippada/forceatlas2/master/examples/grid_graph.png" alt="Grid Graph">
+</p>
+
+## Usage
+
+    from fa2 import ForceAtlas2
+
+Create a ForceAtlas2 object with the appropriate settings. The ForceAtlas2 class has three important methods:
+```python
+forceatlas2(G, pos, iterations)
+# G is a graph in 2D numpy ndarray format (or) scipy sparse matrix format. You can set the edge weights (> 0) in the matrix
+# pos is a numpy array (Nx2) of initial positions of nodes
+# iterations is num of iterations to run the algorithm
+# returns a list of (x,y) pairs for each node's final position
+```
+```python
+forceatlas2_networkx_layout(G, pos, iterations)
+# G is a networkx graph. Edge weights can be set (if required) in the Networkx graph
+# pos is a dictionary, as in networkx
+# iterations is num of iterations to run the algorithm
+# returns a dictionary of node positions (2D X-Y tuples) indexed by the node name
+```
+```python
+forceatlas2_igraph_layout(G, pos, iterations, weight_attr)
+# G is an igraph graph
+# pos is a numpy array (Nx2) or list of initial positions of nodes (see that the indexing matches igraph node index)
+# iterations is num of iterations to run the algorithm
+# weight_attr denotes the weight attribute's name in G.es, None by default
+# returns an igraph layout
+```
+Below is an example. It also shows the settings accepted by the ForceAtlas2 class.
+
+```python
+import networkx as nx
+from fa2 import ForceAtlas2
+import matplotlib.pyplot as plt
+
+G = nx.random_geometric_graph(400, 0.2)
+
+forceatlas2 = ForceAtlas2(
+                        # Behavior alternatives
+                        outboundAttractionDistribution=True,  # Dissuade hubs
+                        linLogMode=False,  # NOT IMPLEMENTED
+                        adjustSizes=False,  # Prevent overlap (NOT IMPLEMENTED)
+                        edgeWeightInfluence=1.0,
+
+                        # Performance
+                        jitterTolerance=1.0,  # Tolerance
+                        barnesHutOptimize=True,
+                        barnesHutTheta=1.2,
+                        multiThreaded=False,  # NOT IMPLEMENTED
+
+                        # Tuning
+                        scalingRatio=2.0,
+                        strongGravityMode=False,
+                        gravity=1.0,
+
+                        # Log
+                        verbose=True)
+
+positions = forceatlas2.forceatlas2_networkx_layout(G, pos=None, iterations=2000)
+nx.draw_networkx_nodes(G, positions, node_size=20, node_color="blue", alpha=0.4)  # with_labels is not a node-drawing option
+nx.draw_networkx_edges(G, positions, edge_color="green", alpha=0.05)
+plt.axis('off')
+plt.show()
+
+# equivalently
+import igraph
+G = igraph.Graph.TupleList(G.edges(), directed=False)
+layout = forceatlas2.forceatlas2_igraph_layout(G, pos=None, iterations=2000)
+igraph.plot(G, layout).show()
+```
+You can also take a look at the forceatlas2.py file to better understand the ForceAtlas2 class and its functions.
+
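The matrix-based `forceatlas2` method described above returns positions as a list in adjacency-matrix row order, while the NetworkX wrapper returns a dictionary keyed by node name. A minimal sketch of converting the first form into the second, assuming you keep your own list of node names in row order (`positions_by_name` is a hypothetical helper, not part of fa2, and the coordinates below are made up):

```python
def positions_by_name(node_names, positions):
    """Pair each node name with its (x, y) tuple; both sequences share matrix row order."""
    if len(node_names) != len(positions):
        raise ValueError("expected one position per node")
    return {name: (float(x), float(y)) for name, (x, y) in zip(node_names, positions)}


# Hypothetical output for a 3-node graph, in adjacency-matrix row order.
layout = positions_by_name(["a", "b", "c"], [(0.0, 1.0), (2.5, -1.0), (-3.0, 0.5)])
```

A dictionary in this shape can be passed wherever NetworkX drawing functions expect a `pos` argument.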
+## Features Completed
+
+-   **barnesHutOptimize**: Barnes-Hut optimization, reducing repulsion complexity from n<sup>2</sup> to n·ln(n)
+-   **gravity**: Attracts nodes to the center. Prevents islands from drifting away
+-   **Dissuade Hubs**: Distributes attraction along outbound edges. Hubs attract less and thus are pushed to the borders
+-   **scalingRatio**: How much repulsion you want. Higher values make a sparser graph
+-   **strongGravityMode**: A stronger gravity law, in which the pull grows with distance from the center
+-   **jitterTolerance**: How much swinging you allow. Values above 1 are discouraged. Lower values give less speed and more precision
+-   **verbose**: Shows a progress bar of iterations completed. Also shows the time taken by the different force computations
+-   **edgeWeightInfluence**: How much influence you give to the edge weights. 0 is "no influence" and 1 is "normal"
+
+## Documentation
+
+You will find all the documentation in the source code.
+
+## Contributors
+
+Contributions are highly welcome. Please submit your pull requests and become a collaborator.
+
+## Copyright
+
+    Copyright (C) 2017 Bhargav Chippada bhargavchippada19@gmail.com.
+    Licensed under the GNU GPLv3.
+
+The files are heavily based on the Java files included in Gephi, git revision 2b9a7c8, and on Max Shinn's Python port of the algorithm. Here I include the copyright information from those files:
+
+    Copyright 2008-2011 Gephi
+    Authors : Mathieu Jacomy <mathieu.jacomy@gmail.com>
+    Website : http://www.gephi.org
+    Copyright 2011 Gephi Consortium. All rights reserved.
+    Portions Copyrighted 2011 Gephi Consortium.
+    The contents of this file are subject to the terms of either the
+    GNU General Public License Version 3 only ("GPL") or the Common
+    Development and Distribution License("CDDL") (collectively, the
+    "License"). You may not use this file except in compliance with
+    the License.
+
+    <https://github.com/mwshinn/forceatlas2-python>
+    Copyright 2016 Max Shinn <mws41@cam.ac.uk>
+    Available under the GPLv3
+
+    Also, thanks to Eugene Bosiakov <https://github.com/bosiakov/fa2l>
--- a/pipelines/14_fdg_animation_write_input/force_abstract_graph_2Danimation/iterative_fa2/disclaimer.txt
+++ b/pipelines/14_fdg_animation_write_input/force_abstract_graph_2Danimation/iterative_fa2/disclaimer.txt
@ -0,0 +1,5 @@
+Package downloaded from https://github.com/bhargavchippada/forceatlas2
+forceatlas2.py has been modified and it is different from the original script.
+The modification returns the FDG coordinates for every iteration. This is needed to create an animated force-directed graph.
+
+It is the understanding of the person (Dorin-Mirel Popescu) who modified the published package that forceatlas2 is subject to the terms of GPL version 3, which allow modifying the original code and publishing the modified version. The original author of forceatlas2 (Mathieu Jacomy) is acknowledged. Furthermore, the modifications in this version do not pertain to the algorithm, only to functionality that keeps all transient states so that the evolution of the force-directed graph can be tracked and visualised as a video.
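The idea of keeping every transient state can be illustrated with a small standalone sketch. It assumes a toy update rule and a hypothetical function name (`toy_layout_with_history`); it is not the modified `forceatlas2.py`, only an illustration of recording one snapshot of coordinates per iteration so that frames can later be rendered as an animation.

```python
import random


def toy_layout_with_history(n_nodes, iterations, seed=0):
    """Hypothetical stand-in for an iterative layout; NOT the ForceAtlas2 update."""
    rng = random.Random(seed)
    positions = [[rng.random(), rng.random()] for _ in range(n_nodes)]
    history = []  # one snapshot per iteration
    for _ in range(iterations):
        # Placeholder update: shrink every node towards the origin by 10%.
        for p in positions:
            p[0] *= 0.9
            p[1] *= 0.9
        # Copy the coordinates; appending `positions` itself would alias
        # every frame to the final state.
        history.append([(p[0], p[1]) for p in positions])
    return history


frames = toy_layout_with_history(n_nodes=5, iterations=3)
```

Copying the coordinates at each step is the important design choice: the node objects are mutated in place, so storing references would leave every frame identical.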
--- a/pipelines/14_fdg_animation_write_input/force_abstract_graph_2Danimation/iterative_fa2/fa2/__init__.py
+++ b/pipelines/14_fdg_animation_write_input/force_abstract_graph_2Danimation/iterative_fa2/fa2/__init__.py
@ -0,0 +1 @@
+from .forceatlas2 import *
--- a/pipelines/14_fdg_animation_write_input/force_abstract_graph_2Danimation/iterative_fa2/fa2/fa2util.c
+++ b/pipelines/14_fdg_animation_write_input/force_abstract_graph_2Danimation/iterative_fa2/fa2/fa2util.c
--- a/pipelines/14_fdg_animation_write_input/force_abstract_graph_2Danimation/iterative_fa2/fa2/fa2util.pxd
+++ b/pipelines/14_fdg_animation_write_input/force_abstract_graph_2Danimation/iterative_fa2/fa2/fa2util.pxd
@ -0,0 +1,122 @@
+# Cython optimizations.  Cython allows huge speed boosts by giving
+# each variable a type.  This file is called a "pxd extension file"
+# (see the "Pure Python" section of the Cython manual).  In essence,
+# it provides types for function definitions and then, if cython is
+# available, it uses these types to optimize normal python code.  It
+# is associated with the fa2util.py file.
+#
+# IF ANY CHANGES ARE MADE TO fa2util.py, THE CHANGES MUST BE REFLECTED
+# HERE!!
+#
+# Copyright (C) 2017 Bhargav Chippada <bhargavchippada19@gmail.com>
+#
+# Available under the GPLv3
+
+import cython
+
+# This will substitute for the nLayout object
+cdef class Node:
+    cdef public double mass
+    cdef public double old_dx, old_dy
+    cdef public double dx, dy
+    cdef public double x, y
+
+# This is not in the original java function, but it makes it easier to
+# deal with edges.
+cdef class Edge:
+    cdef public int node1, node2
+    cdef public double weight
+
+# Repulsion function.  `n1` and `n2` should be nodes.  This will
+# adjust the dx and dy values of both `n1` and `n2`.  It does
+# not return anything.
+
+@cython.locals(xDist = cython.double, 
+               yDist = cython.double, 
+               distance2 = cython.double, 
+               factor = cython.double)
+cdef void linRepulsion(Node n1, Node n2, double coefficient=*)
+
+@cython.locals(xDist = cython.double,
+               yDist = cython.double,
+               distance2 = cython.double,
+               factor = cython.double)
+cdef void linRepulsion_region(Node n, Region r, double coefficient=*)
+
+
+@cython.locals(xDist = cython.double, 
+               yDist = cython.double, 
+               distance = cython.double, 
+               factor = cython.double)
+cdef void linGravity(Node n, double g)
+
+
+@cython.locals(xDist = cython.double, 
+               yDist = cython.double, 
+               factor = cython.double)
+cdef void strongGravity(Node n, double g, double coefficient=*)
+
+@cython.locals(xDist = cython.double, 
+               yDist = cython.double, 
+               factor = cython.double)
+cpdef void linAttraction(Node n1, Node n2, double e, bint distributedAttraction, double coefficient=*)
+
+@cython.locals(i = cython.int,
+               j = cython.int,
+               n1 = Node,
+               n2 = Node)
+cpdef void apply_repulsion(list nodes, double coefficient)
+
+@cython.locals(n = Node)
+cpdef void apply_gravity(list nodes, double gravity, bint useStrongGravity=*)
+
+@cython.locals(edge = Edge)
+cpdef void apply_attraction(list nodes, list edges, bint distributedAttraction, double coefficient, double edgeWeightInfluence)
+
+cdef class Region:
+    cdef public double mass
+    cdef public double massCenterX, massCenterY
+    cdef public double size
+    cdef public list nodes
+    cdef public list subregions
+
+    @cython.locals(massSumX = cython.double,
+                   massSumY = cython.double,
+                   n = Node,
+                   distance = cython.double)
+    cdef void updateMassAndGeometry(self)
+
+    @cython.locals(n = Node,
+                   leftNodes = list,
+                   rightNodes = list,
+                   topleftNodes = list,
+                   bottomleftNodes = list,
+                   toprightNodes = list,
+                   bottomrightNodes = list,
+                   subregion = Region)
+    cpdef void buildSubRegions(self)
+
+
+    @cython.locals(distance = cython.double,
+                   subregion = Region)
+    cdef void applyForce(self, Node n, double theta, double coefficient=*)
+
+    @cython.locals(n = Node)
+    cpdef applyForceOnNodes(self, list nodes, double theta, double coefficient=*)
+
+@cython.locals(totalSwinging = cython.double,
+               totalEffectiveTraction = cython.double,
+               n = Node,
+               swinging = cython.double,
+               estimatedOptimalJitterTolerance = cython.double,
+               minJT = cython.double,
+               maxJT = cython.double,
+               jt = cython.double,
+               minSpeedEfficiency = cython.double,
+               targetSpeed = cython.double,
+               maxRise = cython.double,
+               factor = cython.double,
+               values = dict)
+cpdef dict adjustSpeedAndApplyForces(list nodes, double speed, double speedEfficiency, double jitterTolerance)
--- a/pipelines/14_fdg_animation_write_input/force_abstract_graph_2Danimation/iterative_fa2/fa2/fa2util.py
+++ b/pipelines/14_fdg_animation_write_input/force_abstract_graph_2Danimation/iterative_fa2/fa2/fa2util.py
@ -0,0 +1,326 @@
+# This file allows separating the most CPU intensive routines from the
+# main code.  This allows them to be optimized with Cython.  If you
+# don't have Cython, this will run normally.  However, if you use
+# Cython, you'll get speed boosts from 10-100x automatically.
+#
+# THE ONLY CATCH IS THAT IF YOU MODIFY THIS FILE, YOU MUST ALSO MODIFY
+# fa2util.pxd TO REFLECT ANY CHANGES IN FUNCTION DEFINITIONS!
+#
+# Copyright (C) 2017 Bhargav Chippada <bhargavchippada19@gmail.com>
+#
+# Available under the GPLv3
+
+from math import sqrt
+
+
+# This will substitute for the nLayout object
+class Node:
+    def __init__(self):
+        self.mass = 0.0
+        self.old_dx = 0.0
+        self.old_dy = 0.0
+        self.dx = 0.0
+        self.dy = 0.0
+        self.x = 0.0
+        self.y = 0.0
+
+
+# This is not in the original java code, but it makes it easier to deal with edges
+class Edge:
+    def __init__(self):
+        self.node1 = -1
+        self.node2 = -1
+        self.weight = 0.0
+
+
+# Here are some functions from ForceFactory.java
+# =============================================
+
+# Repulsion function.  `n1` and `n2` should be nodes.  This will
+# adjust the dx and dy values of `n1` and `n2`.
+def linRepulsion(n1, n2, coefficient=0):
+    xDist = n1.x - n2.x
+    yDist = n1.y - n2.y
+    distance2 = xDist * xDist + yDist * yDist  # Distance squared
+
+    if distance2 > 0:
+        factor = coefficient * n1.mass * n2.mass / distance2
+        n1.dx += xDist * factor
+        n1.dy += yDist * factor
+        n2.dx -= xDist * factor
+        n2.dy -= yDist * factor
+
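The arithmetic in `linRepulsion` can be checked numerically. The following is a minimal standalone sketch, assuming a stripped-down `Node` with only the fields used here and a hypothetical helper name `repel`. It shows that the displacement has magnitude `coefficient * m1 * m2 / distance` and is equal and opposite on the two nodes.

```python
from math import hypot, isclose


class Node:
    """Minimal stand-in for fa2util.Node (only the fields used below)."""
    def __init__(self, x, y, mass):
        self.x, self.y, self.mass = x, y, mass
        self.dx = self.dy = 0.0


def repel(n1, n2, coefficient):
    """Same arithmetic as linRepulsion: push n1 and n2 apart."""
    x_dist = n1.x - n2.x
    y_dist = n1.y - n2.y
    distance2 = x_dist * x_dist + y_dist * y_dist
    if distance2 > 0:
        factor = coefficient * n1.mass * n2.mass / distance2
        n1.dx += x_dist * factor
        n1.dy += y_dist * factor
        n2.dx -= x_dist * factor
        n2.dy -= y_dist * factor


a = Node(0.0, 0.0, mass=2.0)
b = Node(3.0, 4.0, mass=3.0)   # distance 5 from a
repel(a, b, coefficient=2.0)

magnitude = hypot(a.dx, a.dy)  # 2 * 2 * 3 / 5 = 2.4
```

Because the factor divides by the squared distance and is then multiplied by the distance vector, the net force falls off as 1/distance rather than 1/distance².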
+
+# Repulsion function. 'n' is node and 'r' is region
+def linRepulsion_region(n, r, coefficient=0):
+    xDist = n.x - r.massCenterX
+    yDist = n.y - r.massCenterY
+    distance2 = xDist * xDist + yDist * yDist
+
+    if distance2 > 0:
+        factor = coefficient * n.mass * r.mass / distance2
+        n.dx += xDist * factor
+        n.dy += yDist * factor
+
+
+# Gravity repulsion function.  For some reason, gravity was included
+# within the linRepulsion function in the original gephi java code,
+# which doesn't make any sense (considering a. gravity is unrelated to
+# nodes repelling each other, and b. gravity is actually an
+# attraction)
+def linGravity(n, g):
+    xDist = n.x
+    yDist = n.y
+    distance = sqrt(xDist * xDist + yDist * yDist)
+
+    if distance > 0:
+        factor = n.mass * g / distance
+        n.dx -= xDist * factor
+        n.dy -= yDist * factor
+
+
+# Strong gravity force function. `n` should be a node, and `g`
+# should be a constant by which to apply the force.
+def strongGravity(n, g, coefficient=0):
+    xDist = n.x
+    yDist = n.y
+
+    if xDist != 0 or yDist != 0:  # skip only a node sitting exactly at the origin
+        factor = coefficient * n.mass * g
+        n.dx -= xDist * factor
+        n.dy -= yDist * factor
+
+
+# Attraction function.  `n1` and `n2` should be nodes.  This will
+# adjust the dx and dy values of `n1` and `n2`.  It does
+# not return anything.
+def linAttraction(n1, n2, e, distributedAttraction, coefficient=0):
+    xDist = n1.x - n2.x
+    yDist = n1.y - n2.y
+    if not distributedAttraction:
+        factor = -coefficient * e
+    else:
+        factor = -coefficient * e / n1.mass
+    n1.dx += xDist * factor
+    n1.dy += yDist * factor
+    n2.dx -= xDist * factor
+    n2.dy -= yDist * factor
+
+
+# The following functions iterate through the nodes or edges and apply
+# the forces directly to the node objects.  These iterations are here
+# instead of the main file because Python is slow with loops.
+def apply_repulsion(nodes, coefficient):
+    i = 0
+    for n1 in nodes:
+        j = i
+        for n2 in nodes:
+            if j == 0:
+                break
+            linRepulsion(n1, n2, coefficient)
+            j -= 1
+        i += 1
+
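The counter bookkeeping in `apply_repulsion` is easy to misread. A minimal sketch, replaying the same loop on plain indices (the function name `visited_pairs` is hypothetical), suggests that it visits every unordered pair of nodes exactly once, which matters because `linRepulsion` already updates both nodes of a pair.

```python
from itertools import combinations


def visited_pairs(n):
    """Replay the index bookkeeping of apply_repulsion for n nodes."""
    pairs = []
    i = 0
    for a in range(n):          # outer loop: n1
        j = i
        for b in range(n):      # inner loop: n2
            if j == 0:
                break
            pairs.append((a, b))
            j -= 1
        i += 1
    return pairs


pairs = visited_pairs(4)
unordered = {tuple(sorted(p)) for p in pairs}
```

For four nodes this yields six pairs, the same set as `itertools.combinations(range(4), 2)`.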
+
+def apply_gravity(nodes, gravity, useStrongGravity=False):
+    if not useStrongGravity:
+        for n in nodes:
+            linGravity(n, gravity)
+    else:
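+        for n in nodes:
+            # Pass a coefficient explicitly: strongGravity defaults to
+            # coefficient=0, which would make the force vanish.  With 1.0
+            # the pull is mass * gravity * position.
+            strongGravity(n, gravity, 1.0)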
+
+
+def apply_attraction(nodes, edges, distributedAttraction, coefficient, edgeWeightInfluence):
+    # Optimization, since usually edgeWeightInfluence is 0 or 1, and pow is slow
+    if edgeWeightInfluence == 0:
+        for edge in edges:
+            linAttraction(nodes[edge.node1], nodes[edge.node2], 1, distributedAttraction, coefficient)
+    elif edgeWeightInfluence == 1:
+        for edge in edges:
+            linAttraction(nodes[edge.node1], nodes[edge.node2], edge.weight, distributedAttraction, coefficient)
+    else:
+        for edge in edges:
+            linAttraction(nodes[edge.node1], nodes[edge.node2], pow(edge.weight, edgeWeightInfluence),
+                          distributedAttraction, coefficient)
+
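The three branches of `apply_attraction` and the two branches of `linAttraction` reduce to a pair of small formulas. The sketch below restates them as standalone functions; the names `effective_weight` and `attraction_factor` are hypothetical and exist only for illustration.

```python
def effective_weight(weight, edge_weight_influence):
    """Weight handed to linAttraction for a given edgeWeightInfluence."""
    if edge_weight_influence == 0:
        return 1            # weights ignored
    if edge_weight_influence == 1:
        return weight       # weights used as-is
    return pow(weight, edge_weight_influence)


def attraction_factor(coefficient, weight, source_mass, distributed):
    """Scalar multiplying (n1 - n2) in linAttraction; negative pulls the nodes together."""
    if distributed:
        return -coefficient * weight / source_mass
    return -coefficient * weight


ignored = effective_weight(4.0, 0)
as_is = effective_weight(4.0, 1)
damped = effective_weight(4.0, 0.5)
hub_pull = attraction_factor(1.0, as_is, source_mass=8.0, distributed=True)
plain_pull = attraction_factor(1.0, as_is, source_mass=8.0, distributed=False)
```

With distributed attraction, an edge leaving a high-mass node pulls less strongly, which is the "dissuade hubs" behavior.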
+
+# For Barnes Hut Optimization
+class Region:
+    def __init__(self, nodes):
+        self.mass = 0.0
+        self.massCenterX = 0.0
+        self.massCenterY = 0.0
+        self.size = 0.0
+        self.nodes = nodes
+        self.subregions = []
+        self.updateMassAndGeometry()
+
+    def updateMassAndGeometry(self):
+        if len(self.nodes) > 1:
+            self.mass = 0
+            massSumX = 0
+            massSumY = 0
+            for n in self.nodes:
+                self.mass += n.mass
+                massSumX += n.x * n.mass
+                massSumY += n.y * n.mass
+            self.massCenterX = massSumX / self.mass
+            self.massCenterY = massSumY / self.mass
+
+            self.size = 0.0
+            for n in self.nodes:
+                distance = sqrt((n.x - self.massCenterX) ** 2 + (n.y - self.massCenterY) ** 2)
+                self.size = max(self.size, 2 * distance)
+
+    def buildSubRegions(self):
+        if len(self.nodes) > 1:
+
+            leftNodes = []
+            rightNodes = []
+            for n in self.nodes:
+                if n.x < self.massCenterX:
+                    leftNodes.append(n)
+                else:
+                    rightNodes.append(n)
+
+            topleftNodes = []
+            bottomleftNodes = []
+            for n in leftNodes:
+                if n.y < self.massCenterY:
+                    topleftNodes.append(n)
+                else:
+                    bottomleftNodes.append(n)
+
+            toprightNodes = []
+            bottomrightNodes = []
+            for n in rightNodes:
+                if n.y < self.massCenterY:
+                    toprightNodes.append(n)
+                else:
+                    bottomrightNodes.append(n)
+
+            if len(topleftNodes) > 0:
+                if len(topleftNodes) < len(self.nodes):
+                    subregion = Region(topleftNodes)
+                    self.subregions.append(subregion)
+                else:
+                    for n in topleftNodes:
+                        subregion = Region([n])
+                        self.subregions.append(subregion)
+
+            if len(bottomleftNodes) > 0:
+                if len(bottomleftNodes) < len(self.nodes):
+                    subregion = Region(bottomleftNodes)
+                    self.subregions.append(subregion)
+                else:
+                    for n in bottomleftNodes:
+                        subregion = Region([n])
+                        self.subregions.append(subregion)
+
+            if len(toprightNodes) > 0:
+                if len(toprightNodes) < len(self.nodes):
+                    subregion = Region(toprightNodes)
+                    self.subregions.append(subregion)
+                else:
+                    for n in toprightNodes:
+                        subregion = Region([n])
+                        self.subregions.append(subregion)
+
+            if len(bottomrightNodes) > 0:
+                if len(bottomrightNodes) < len(self.nodes):
+                    subregion = Region(bottomrightNodes)
+                    self.subregions.append(subregion)
+                else:
+                    for n in bottomrightNodes:
+                        subregion = Region([n])
+                        self.subregions.append(subregion)
+
+            for subregion in self.subregions:
+                subregion.buildSubRegions()
+
+    def applyForce(self, n, theta, coefficient=0):
+        if len(self.nodes) < 2:
+            linRepulsion(n, self.nodes[0], coefficient)
+        else:
+            distance = sqrt((n.x - self.massCenterX) ** 2 + (n.y - self.massCenterY) ** 2)
+            if distance * theta > self.size:
+                linRepulsion_region(n, self, coefficient)
+            else:
+                for subregion in self.subregions:
+                    subregion.applyForce(n, theta, coefficient)
+
+    def applyForceOnNodes(self, nodes, theta, coefficient=0):
+        for n in nodes:
+            self.applyForce(n, theta, coefficient)
+
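The Barnes-Hut decision in `Region.applyForce` rests on two quantities computed in `updateMassAndGeometry`: the mass-weighted center and the region size. A minimal standalone sketch, assuming points given as `(x, y, mass)` tuples and at least one point per region (the names `summarize` and `can_approximate` are hypothetical):

```python
from math import hypot


def summarize(points):
    """Return (mass, cx, cy, size) using the formulas of updateMassAndGeometry."""
    mass = sum(m for _, _, m in points)
    cx = sum(x * m for x, _, m in points) / mass
    cy = sum(y * m for _, y, m in points) / mass
    # Size is twice the largest distance from the mass center to a member node.
    size = max(2 * hypot(x - cx, y - cy) for x, y, _ in points)
    return mass, cx, cy, size


def can_approximate(px, py, region, theta):
    """Acceptance test from applyForce: distance * theta > size."""
    _, cx, cy, size = region
    return hypot(px - cx, py - cy) * theta > size


cluster = summarize([(0.0, 0.0, 1.0), (2.0, 0.0, 1.0)])  # center (1, 0), size 2
near = can_approximate(2.0, 0.0, cluster, theta=1.2)      # distance 1: 1.2 > 2 fails
far = can_approximate(11.0, 0.0, cluster, theta=1.2)      # distance 10: 12 > 2 holds
```

When the test holds, the whole region is treated as a single body at its mass center; otherwise the recursion descends into the subregions.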
+
+# Adjust speed and apply forces step
+def adjustSpeedAndApplyForces(nodes, speed, speedEfficiency, jitterTolerance):
+    # Auto adjust speed.
+    totalSwinging = 0.0  # How much irregular movement
+    totalEffectiveTraction = 0.0  # How much useful movement
+    for n in nodes:
+        swinging = sqrt((n.old_dx - n.dx) * (n.old_dx - n.dx) + (n.old_dy - n.dy) * (n.old_dy - n.dy))
+        totalSwinging += n.mass * swinging
+        totalEffectiveTraction += .5 * n.mass * sqrt(
+            (n.old_dx + n.dx) * (n.old_dx + n.dx) + (n.old_dy + n.dy) * (n.old_dy + n.dy))
+
+    # Optimize jitter tolerance.  The 'right' jitter tolerance for
+    # this network. Bigger networks need more tolerance. Denser
+    # networks need less tolerance. Totally empiric.
+    estimatedOptimalJitterTolerance = .05 * sqrt(len(nodes))
+    minJT = sqrt(estimatedOptimalJitterTolerance)
+    maxJT = 10
+    jt = jitterTolerance * max(minJT,
+                               min(maxJT, estimatedOptimalJitterTolerance * totalEffectiveTraction / (
+                                   len(nodes) * len(nodes))))
+
+    minSpeedEfficiency = 0.05
+
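+    # Protect against erratic behavior (and against dividing by zero
+    # when no node experienced any net force)
+    if totalEffectiveTraction > 0 and totalSwinging / totalEffectiveTraction > 2.0: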
+        if speedEfficiency > minSpeedEfficiency:
+            speedEfficiency *= .5
+        jt = max(jt, jitterTolerance)
+
+    if totalSwinging == 0:
+        targetSpeed = float('inf')
+    else:
+        targetSpeed = jt * speedEfficiency * totalEffectiveTraction / totalSwinging
+
+    if totalSwinging > jt * totalEffectiveTraction:
+        if speedEfficiency > minSpeedEfficiency:
+            speedEfficiency *= .7
+    elif speed < 1000:
+        speedEfficiency *= 1.3
+
+    # But the speed shouldn't rise too much too quickly, since it would
+    # make the convergence drop dramatically.
+    maxRise = .5
+    speed = speed + min(targetSpeed - speed, maxRise * speed)
+
+    # Apply forces.
+    #
+    # Need to add a case if adjustSizes ("prevent overlap") is
+    # implemented.
+    for n in nodes:
+        swinging = n.mass * sqrt((n.old_dx - n.dx) * (n.old_dx - n.dx) + (n.old_dy - n.dy) * (n.old_dy - n.dy))
+        factor = speed / (1.0 + sqrt(speed * swinging))
+        n.x = n.x + (n.dx * factor)
+        n.y = n.y + (n.dy * factor)
+
+    values = {}
+    values['speed'] = speed
+    values['speedEfficiency'] = speedEfficiency
+
+    return values
+
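The two accumulators at the top of `adjustSpeedAndApplyForces` are easier to interpret on a single node. A minimal sketch, assuming force vectors given as `(dx, dy)` tuples and a hypothetical helper name `swing_and_traction`:

```python
from math import hypot


def swing_and_traction(old, new, mass=1.0):
    """old/new are (dx, dy) force vectors from the previous and current iteration."""
    swinging = mass * hypot(old[0] - new[0], old[1] - new[1])
    traction = 0.5 * mass * hypot(old[0] + new[0], old[1] + new[1])
    return swinging, traction


steady = swing_and_traction((1.0, 0.0), (1.0, 0.0))    # same force twice
flipped = swing_and_traction((1.0, 0.0), (-1.0, 0.0))  # force reversed
```

A node pushed the same way twice contributes only traction (useful movement), while a node whose force reverses contributes only swinging (oscillation). This is the ratio the function uses to slow the layout down or speed it up.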
+
+try:
+    import cython
+
+    if not cython.compiled:
+        print("Warning: uncompiled fa2util module.  Compile with cython for a 10-100x speed boost.")
+except ImportError:
+    print("No cython detected.  Install cython and compile the fa2util module for a 10-100x speed boost.")
--- a/pipelines/14_fdg_animation_write_input/force_abstract_graph_2Danimation/iterative_fa2/fa2/forceatlas2.py
+++ b/pipelines/14_fdg_animation_write_input/force_abstract_graph_2Danimation/iterative_fa2/fa2/forceatlas2.py
@ -0,0 +1,250 @@
+# This is the fastest python implementation of the ForceAtlas2 plugin from Gephi
+# intended to be used with networkx, but is in theory independent of
+# it since it only relies on the adjacency matrix.  This
+# implementation is based directly on the Gephi plugin:
+#
+# https://github.com/gephi/gephi/blob/master/modules/LayoutPlugin/src/main/java/org/gephi/layout/plugin/forceAtlas2/ForceAtlas2.java
+#
+# For simplicity and for keeping code in sync with upstream, I have
+# reused as many of the variable/function names as possible, even when
+# they are in a more java-like style (e.g. camelcase)
+#
+# I wrote this because I wanted an almost feature complete and fast implementation
+# of ForceAtlas2 algorithm in python
+#
+# NOTES: Currently, this only works for weighted undirected graphs.
+#
+# Copyright (C) 2017 Bhargav Chippada <bhargavchippada19@gmail.com>
+#
+# Available under the GPLv3
+
+import random
+import time
+import numpy as np
+
+import numpy
+import scipy.sparse  # `import scipy` alone does not guarantee scipy.sparse is loaded
+from tqdm import tqdm
+
+from . import fa2util
+
+
+class Timer:
+    def __init__(self, name="Timer"):
+        self.name = name
+        self.start_time = 0.0
+        self.total_time = 0.0
+
+    def start(self):
+        self.start_time = time.time()
+
+    def stop(self):
+        self.total_time += (time.time() - self.start_time)
+
+    def display(self):
+        print(self.name, " took ", "%.2f" % self.total_time, " seconds")
+
+
+class ForceAtlas2:
+    def __init__(self,
+                 # Behavior alternatives
+                 outboundAttractionDistribution=False,  # Dissuade hubs
+                 linLogMode=False,  # NOT IMPLEMENTED
+                 adjustSizes=False,  # Prevent overlap (NOT IMPLEMENTED)
+                 edgeWeightInfluence=1.0,
+
+                 # Performance
+                 jitterTolerance=1.0,  # Tolerance
+                 barnesHutOptimize=True,
+                 barnesHutTheta=1.2,
+                 multiThreaded=False,  # NOT IMPLEMENTED
+
+                 # Tuning
+                 scalingRatio=2.0,
+                 strongGravityMode=False,
+                 gravity=1.0,
+
+                 # Log
+                 verbose=True):
+        assert linLogMode == adjustSizes == multiThreaded == False, "You selected a feature that has not been implemented yet..."
+        self.outboundAttractionDistribution = outboundAttractionDistribution
+        self.linLogMode = linLogMode
+        self.adjustSizes = adjustSizes
+        self.edgeWeightInfluence = edgeWeightInfluence
+        self.jitterTolerance = jitterTolerance
+        self.barnesHutOptimize = barnesHutOptimize
+        self.barnesHutTheta = barnesHutTheta
+        self.scalingRatio = scalingRatio
+        self.strongGravityMode = strongGravityMode
+        self.gravity = gravity
+        self.verbose = verbose
+        self.dataContainer = []
+
+    def init(self,
+             G,  # a graph in 2D numpy ndarray format (or) scipy sparse matrix format
+             pos=None  # Array of initial positions
+             ):
+        isSparse = False
+        if isinstance(G, numpy.ndarray):
+            # Check our assumptions
+            assert G.shape == (G.shape[0], G.shape[0]), "G is not 2D square"
+            assert numpy.all(G.T == G), "G is not symmetric.  Currently only undirected graphs are supported"
+            assert isinstance(pos, numpy.ndarray) or (pos is None), "Invalid node positions"
+        elif scipy.sparse.issparse(G):
+            # Check our assumptions
+            assert G.shape == (G.shape[0], G.shape[0]), "G is not 2D square"
+            assert isinstance(pos, numpy.ndarray) or (pos is None), "Invalid node positions"
+            G = G.tolil()
+            isSparse = True
+        else:
+            assert False, "G is not numpy ndarray or scipy sparse matrix"
+
+        # Put nodes into a data structure we can understand
+        nodes = []
+        for i in range(0, G.shape[0]):
+            n = fa2util.Node()
+            if isSparse:
+                n.mass = 1 + len(G.rows[i])
+            else:
+                n.mass = 1 + numpy.count_nonzero(G[i])
+            n.old_dx = 0
+            n.old_dy = 0
+            n.dx = 0
+            n.dy = 0
+            if pos is None:
+                n.x = random.random()
+                n.y = random.random()
+            else:
+                n.x = pos[i][0]
+                n.y = pos[i][1]
+            nodes.append(n)
+
+        # Put edges into a data structure we can understand
+        edges = []
+        es = numpy.asarray(G.nonzero()).T
+        for e in es:  # Iterate through edges
+            if e[1] <= e[0]: continue  # Avoid duplicate edges
+            edge = fa2util.Edge()
+            edge.node1 = e[0]  # The index of the first node in `nodes`
+            edge.node2 = e[1]  # The index of the second node in `nodes`
+            edge.weight = G[tuple(e)]
+            edges.append(edge)
+
+        return nodes, edges
+
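The bookkeeping done by `init` can be sketched without NumPy. The example below assumes a symmetric adjacency matrix stored as nested lists and uses a hypothetical helper name `masses_and_edges`. It mirrors two rules from the code above: node mass is one plus the number of nonzero entries in the node's row, and only entries above the diagonal become edges so that each undirected edge is kept once.

```python
def masses_and_edges(adjacency):
    """adjacency: symmetric square matrix as nested lists; nonzero entry = edge weight."""
    n = len(adjacency)
    masses = [1 + sum(1 for w in row if w != 0) for row in adjacency]
    edges = [(i, j, adjacency[i][j])
             for i in range(n)
             for j in range(i + 1, n)      # j > i: keep each undirected edge once
             if adjacency[i][j] != 0]
    return masses, edges


# Path graph 0 - 1 - 2 with weights 1.0 and 2.5
A = [[0, 1.0, 0],
     [1.0, 0, 2.5],
     [0, 2.5, 0]]
masses, edges = masses_and_edges(A)
```

The middle node has two neighbors and therefore the largest mass, so it repels and is attracted more strongly than the endpoints.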
+    # Given an adjacency matrix, this function computes the node positions
+    # according to the ForceAtlas2 layout algorithm.  It takes the same
+    # arguments that one would give to the ForceAtlas2 algorithm in Gephi.
+    # Not all of them are implemented.  See below for a description of
+    # each parameter and whether or not it has been implemented.
+    #
+    # This function will return a list of X-Y coordinate tuples, ordered
+    # in the same way as the rows/columns in the input matrix.
+    #
+    # The only reason you would want to run this directly is if you don't
+    # use networkx.  In this case, you'll likely need to convert the
+    # output to a more usable format.  If you do use networkx, use the
+    # "forceatlas2_networkx_layout" function below.
+    #
+    # Currently, only undirected graphs are supported so the adjacency matrix
+    # should be symmetric.
+    def forceatlas2(self,
+                    G,  # a graph in 2D numpy ndarray format (or) scipy sparse matrix format
+                    pos=None,  # Array of initial positions
+                    iterations=100  # Number of times to iterate the main loop
+                    ):
+        # Initializing, initAlgo()
+        # ================================================================
+
+        # speed and speedEfficiency describe a scaling factor of dx and dy
+        # before x and y are adjusted.  These are modified as the
+        # algorithm runs to help ensure convergence.
+        speed = 1.0
+        speedEfficiency = 1.0
+        nodes, edges = self.init(G, pos)
+        outboundAttCompensation = 1.0
+        if self.outboundAttractionDistribution:
+            outboundAttCompensation = numpy.mean([n.mass for n in nodes])
+        # ================================================================
+
+        # Main loop, i.e. goAlgo()
+        # ================================================================
+
+        barneshut_timer = Timer(name="BarnesHut Approximation")
+        repulsion_timer = Timer(name="Repulsion forces")
+        gravity_timer = Timer(name="Gravitational forces")
+        attraction_timer = Timer(name="Attraction forces")
+        applyforces_timer = Timer(name="AdjustSpeedAndApplyForces step")
+
+        # Each iteration of this loop represents a call to goAlgo().
+        niters = range(iterations)
+        if self.verbose:
+            niters = tqdm(niters)
+        for _i in niters:
+            for n in nodes:
+                n.old_dx = n.dx
+                n.old_dy = n.dy
+                n.dx = 0
+                n.dy = 0
+
+            # Barnes Hut optimization
+            if self.barnesHutOptimize:
+                barneshut_timer.start()
+                rootRegion = fa2util.Region(nodes)
+                rootRegion.buildSubRegions()
+                barneshut_timer.stop()
+
+            # Charge repulsion forces
+            repulsion_timer.start()
+            # parallelization should be implemented here
+            if self.barnesHutOptimize:
+                rootRegion.applyForceOnNodes(nodes, self.barnesHutTheta, self.scalingRatio)
+            else:
+                fa2util.apply_repulsion(nodes, self.scalingRatio)
+            repulsion_timer.stop()
+
+            # Gravitational forces
+            gravity_timer.start()
+            fa2util.apply_gravity(nodes, self.gravity, useStrongGravity=self.strongGravityMode)
+            gravity_timer.stop()
+
+            # If other forms of attraction were implemented they would be selected here.
+            attraction_timer.start()
+            fa2util.apply_attraction(nodes, edges, self.outboundAttractionDistribution, outboundAttCompensation,
+                                     self.edgeWeightInfluence)
+            attraction_timer.stop()
+
+            # Adjust speeds and apply forces
+            applyforces_timer.start()
+            values = fa2util.adjustSpeedAndApplyForces(nodes, speed, speedEfficiency, self.jitterTolerance)
+            speed = values['speed']
+            speedEfficiency = values['speedEfficiency']
+            applyforces_timer.stop()
+
+            self.dataContainer.append(numpy.array([(n.x, n.y) for n in nodes]))
+
+        if self.verbose:
+            if self.barnesHutOptimize:
+                barneshut_timer.display()
+            repulsion_timer.display()
+            gravity_timer.display()
+            attraction_timer.display()
+            applyforces_timer.display()
+        # ================================================================
+        return [(n.x, n.y) for n in nodes]
+
+    # A layout for NetworkX.
+    #
+    # This function returns a NetworkX layout, which is really just a
+    # dictionary of node positions (2D X-Y tuples) indexed by the node name.
+    def forceatlas2_networkx_layout(self, G, pos=None, iterations=100):
+        import networkx
+        assert isinstance(G, networkx.classes.graph.Graph), "Not a networkx graph"
+        assert isinstance(pos, dict) or (pos is None), "pos must be specified as a dictionary, as in networkx"
+        M = networkx.to_scipy_sparse_matrix(G, dtype='f', format='lil')
+        if pos is None:
+            l = self.forceatlas2(M, pos=None, iterations=iterations)
+        else:
+            poslist = numpy.asarray([pos[i] for i in G.nodes()])
+            l = self.forceatlas2(M, pos=poslist, iterations=iterations)
+        return dict(zip(G.nodes(), l))
--- a/pipelines/14_fdg_animation_write_input/force_abstract_graph_2Danimation/iterative_fa2/setup.py
+++ b/pipelines/14_fdg_animation_write_input/force_abstract_graph_2Danimation/iterative_fa2/setup.py
@ -0,0 +1,75 @@
+from codecs import open
+from os import path
+
+from setuptools import setup
+
+print("Installing fa2 package (fastest forceatlas2 python implementation)\n")
+
+here = path.abspath(path.dirname(__file__))
+
+# Get the long description from the README file
+with open(path.join(here, 'README.md'), 'r') as f:
+    long_description = f.read()
+
+print(">>>> Cython is installed?")
+try:
+    from Cython.Distutils import Extension
+    from Cython.Build import build_ext
+    USE_CYTHON = True
+    print('Yes\n')
+except ImportError:
+    from setuptools.extension import Extension
+    USE_CYTHON = False
+    print('Cython is not installed; using pre-generated C files if available')
+    print('Please install Cython first and try again if you face any installation problems\n')
+    print(">>>> Are pre-generated C files available?")
+
+if USE_CYTHON:
+    ext_modules = [Extension('fa2.fa2util', ['fa2/fa2util.py', 'fa2/fa2util.pxd'], cython_directives={'language_level' : 3})]
+    cmdclass = {'build_ext': build_ext}
+    opts = {"ext_modules": ext_modules, "cmdclass": cmdclass}
+elif path.isfile(path.join(here, 'fa2/fa2util.c')):
+    print("Yes\n")
+    ext_modules = [Extension('fa2.fa2util', ['fa2/fa2util.c'])]
+    cmdclass = {}
+    opts = {"ext_modules": ext_modules, "cmdclass": cmdclass}
+else:
+    print("Pre-generated C files are not available. This library will be slow without Cython optimizations.\n")
+    opts = {"py_modules": ["fa2.fa2util"]}
+
+# Uncomment the following line if you want to install without optimizations
+# opts = {"py_modules": ["fa2.fa2util"]}
+
+print(">>>> Starting to install!\n")
+
+setup(
+    name='fa2',
+    version='0.3.5',
+    description='The fastest ForceAtlas2 algorithm for Python (and NetworkX)',
+    long_description_content_type='text/markdown',
+    long_description=long_description,
+    author='Bhargav Chippada',
+    author_email='bhargavchippada19@gmail.com',
+    url='https://github.com/bhargavchippada/forceatlas2',
+    download_url='https://github.com/bhargavchippada/forceatlas2/archive/v0.3.5.tar.gz',
+    keywords=['forceatlas2', 'networkx', 'force-directed-graph', 'force-layout', 'graph'],
+    packages=['fa2'],
+    classifiers=[
+        'Development Status :: 5 - Production/Stable',
+
+        'Intended Audience :: Science/Research',
+        'Topic :: Scientific/Engineering :: Mathematics',
+
+        'License :: OSI Approved :: GNU General Public License v3 (GPLv3)',
+
+        'Programming Language :: Python :: 2',
+        'Programming Language :: Python :: 3'
+    ],
+    install_requires=['numpy', 'scipy', 'tqdm'],
+    extras_require={
+        'networkx': ['networkx'],
+        'igraph': ['python-igraph']
+    },
+    include_package_data=True,
+    **opts
+)
--- a/pipelines/14_fdg_animation_write_input/force_abstract_graph_2Danimation/make_fdg_animation.py
+++ b/pipelines/14_fdg_animation_write_input/force_abstract_graph_2Danimation/make_fdg_animation.py
@ -0,0 +1,108 @@
+
+from os import mkdir
+from os.path import exists
+from shutil import rmtree
+from fa2 import ForceAtlas2
+import pandas as pd
+from scipy.io import mmread
+import numpy as np
+import subprocess
+
+# smaller steps by:
+# - decrease barnesHutOptimize
+# - decrease gravity
+
+# number of frames
+frames = 2000
+# load PCA, SNN and label colour data
+# the first 2 PCs from the PCA are used as initial conditions
+# the SNN graph is used for building the force-directed graph
+pca_data = pd.read_csv("./input/pca.csv", index_col = 0)
+labels_col   = pd.read_csv("./input/label_colours.csv", squeeze = True, index_col = 0)
+snn      = mmread("./input/SNN.smm")
+
+# set the initial positions to the first 2 PCs
+positions = pca_data.values[:, 0:2]
+
+# initialize force directed graph class instance
+forceatlas2 = ForceAtlas2(outboundAttractionDistribution=False, linLogMode=False,
+                          adjustSizes=False, edgeWeightInfluence=1.0,
+                          jitterTolerance=1.0, barnesHutTheta = .8,
+                          barnesHutOptimize=True, multiThreaded=False,
+                          scalingRatio=2.0, strongGravityMode=True, gravity=1, verbose=True)
+
+# run the force-directed graph layout; each iteration generates the coordinates used in one frame
+discard = forceatlas2.forceatlas2(G = snn, pos = positions, iterations = frames)
+
+if exists("./input/buffers"):
+    rmtree("./input/buffers")
+if exists("./input/frames"):
+    rmtree("./input/frames")
+mkdir("./input/buffers")
+mkdir("./input/frames")
+for index in range(len(forceatlas2.dataContainer)):
+    positions = forceatlas2.dataContainer[index]
+    fname = "./input/buffers/{index}.csv".format(index = index)
+    np.savetxt(fname, positions, delimiter = ",")
+    print("Saving buffer: {index}".format(index = index))
+    
+# run R
+subprocess.call(["Rscript", "make_plots.R"])  # argument list, so no shell is needed
+
+
+# assemble the frames into a video
+    
+import cv2
+import os
+
+def sortImages(imgPath):
+    return int(os.path.splitext(imgPath)[0])
+
+# Arguments
+dir_path = './input/frames'
+ext = "png"
+output = "fdg.mp4"
+
+images = []
+for f in os.listdir(dir_path):
+    if f.endswith(ext):
+        images.append(f)
+
+images = sorted(images, key = sortImages)
+
+legend = cv2.imread("./input/legend.png")
+lH, lW, chs = legend.shape
+legend = legend[0:(lH-10), 10:lW]
+legend = cv2.resize(legend, (0, 0), fx = .8, fy = .8)
+lH, lW, chs = legend.shape
+
+# Determine the width and height from the first image
+image_path = os.path.join(dir_path, images[0])
+frame = cv2.imread(image_path)
+cv2.imshow('video',frame)
+height, width, channels = frame.shape
+
+# Define the codec and create VideoWriter object
+fourcc = cv2.VideoWriter_fourcc(*'mp4v') # Be sure to use lower case
+out = cv2.VideoWriter(output, fourcc, 30.0, (width + lW, height))  # frame width plus legend width
+import numpy as np
+for image in images:
+
+    image_path = os.path.join(dir_path, image)
+    frame = cv2.imread(image_path)
+    frame = cv2.resize(frame, (width, height))
+    lh1 = width + lW
+    template = np.zeros((height, lW, 3), dtype = frame.dtype)
+    frame = np.hstack((frame, template))
+    frame[0:lH, width:lh1, :] = legend
+
+    #cv2.putText(frame, "by Dorin-Mirel Popescu", (width - 400, height - 30), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 255, 255), thickness = 2) 
+
+    out.write(frame) # Write out frame to video
+    print(image)
+
+# Release everything if job is finished
+out.release()
+cv2.destroyAllWindows()
+
+print("The output video is {}".format(output))
--- a/pipelines/14_fdg_animation_write_input/force_abstract_graph_2Danimation/make_plots.R
+++ b/pipelines/14_fdg_animation_write_input/force_abstract_graph_2Danimation/make_plots.R
@ -0,0 +1,42 @@
+setwd("~/Documents/MyTools/force_abstract_graph_2Danimation/")
+buffers.addrs <- list.files("./input/buffers/", full.names=T)
+data.colours <- as.vector(read.csv("./input/label_colours.csv")$LabelCols)
+
+################################################################################################################
+################################################################################################################
+################################################################################################################
+library(RColorBrewer)
+library(dplyr)
+library(plyr)
+library(Seurat)
+
+#c.unique  <-as.vector( unique(data.colours))
+#c.colours <- sample(colorRampPalette(brewer.pal(12, "Paired"))(length(c.unique)))
+#data.colours <- factor(plyr::mapvalues(x=data.colours, from=c.unique, to = c.colours), levels = c.colours)
+
+################################################################################################################
+################################################################################################################
+################################################################################################################
+
+for(k in 1:length(buffers.addrs)){
+  buffer.addr <- buffers.addrs[k]
+  print(sprintf("Plotting frame %d", k))
+  buffer.data <- read.csv(buffer.addr, header = F)
+  buffer.data <- cbind(buffer.data, data.colours)
+  colnames(buffer.data) <- c("FDGX", "FDGY", "Colours")
+  limitX <- quantile(buffer.data$FDGX, c(.01, .99)) + c(-15000, 15000)
+  limitY <- 1.1 * quantile(buffer.data$FDGY, c(.01, .99)) + c(-15000, 15000)
+  plot.obj <- ggplot(data=buffer.data, aes(x = FDGX, y = FDGY))
+  plot.obj <- plot.obj + geom_point(show.legend=F, size = 1.5, color = as.vector(buffer.data$Colours))
+  plot.obj <- plot.obj + scale_color_manual(values=as.vector(buffer.data$Colours))
+  plot.obj <- plot.obj + theme(plot.background = element_rect(fill = "black"))
+  plot.obj <- plot.obj + scale_x_continuous(limits = limitX, expand = c(0, 0))
+  plot.obj <- plot.obj + scale_y_continuous(limits = limitY, expand = c(0, 0))
+  plot.obj <- plot.obj + theme(axis.title = element_blank(),
+                               axis.text = element_blank(),
+                               axis.ticks = element_blank())
+  fname <- file.path("./input/frames", sub(pattern=".csv", replacement=".png", x=basename(buffer.addr), fixed=TRUE))
+  png(fname, width = 2000, height = 2000)
+  print(plot.obj)
+  dev.off()
+}
--- a/pipelines/14_fdg_animation_write_input/force_abstract_graph_2Danimation/prepare_input.R
+++ b/pipelines/14_fdg_animation_write_input/force_abstract_graph_2Danimation/prepare_input.R
@ -0,0 +1,71 @@
+# import libraries
+library(Seurat)
+library(plyr)
+
+seurat.obj.addr <- "../../seurat_data/liver_immune.RDS"
+
+# a plotting function for indexed legend; special modifications for current script
+plot.indexed.legend <- function(label.vector, color.vector, ncols = 2, left.limit = 3.4, symbol.size = 8, text.size = 10){
+  if (length(label.vector) != length(color.vector)){
+    stop("number of labels is different from number of colors\nAdvice: learn to count!")
+  }
+  if (ncols > length(label.vector)){
+    stop("You cannot have more columns than labels\nSolution: Learn to count")
+  }
+  indices.vector <- 1:length(label.vector)
+  label.no <- length(label.vector)
+  nrows <- ceiling(label.no / ncols)
+  legend.frame <- data.frame(X = rep(0, label.no), Y = rep(0, label.no), CS = color.vector, Txt = label.vector)
+  for (i in 1:label.no){
+    col.index <- floor((i - 1) / nrows) + 1
+    row.index <- 15 - ((i - 1) %% nrows + 1)
+    legend.frame[i, 1] <- (col.index - 1) * 2
+    legend.frame[i, 2] <- row.index
+  }
+  plot.obj <- ggplot(data = legend.frame, aes(x = X, y = Y))
+  plot.obj <- plot.obj + geom_point(size = symbol.size, colour = color.vector)
+  plot.obj <- plot.obj + scale_x_continuous(limits = c(0, left.limit)) + theme_void()
+  plot.obj <- plot.obj + annotate("text", x=legend.frame$X+.1, y = legend.frame$Y, label=legend.frame$Txt, hjust = 0, size = text.size, colour = "white")
+  plot.obj <- plot.obj + theme(panel.background = element_rect(fill='black'))
+  return(plot.obj)
+}
+
+# load the seurat object
+print("Loading the data ... ")
+seurat.obj <- readRDS(seurat.obj.addr)
+cell.type.to.colour <- read.csv("./liver_cell_type_colours.csv")
+
+seurat.obj <- SetAllIdent(object=seurat.obj, id="cell.labels")
+
+################################################
+print("saving pca data ...")
+pca.data <- seurat.obj@dr$pca@cell.embeddings
+write.csv(pca.data, "./input/pca.csv")
+
+################################################
+print("Computing and saving SNN graph ...")
+seurat.obj <- BuildSNN(object=seurat.obj, reduction.type="pca", dims.use=1:20, plot.SNN=F, force.recalc=T)
+writeMM(obj=seurat.obj@snn, file="./input/SNN.smm")
+
+labels        <- as.vector(seurat.obj@ident)
+labels.unique <- unique(labels)
+filter.key    <- cell.type.to.colour$CellTypes %in% labels.unique
+cell.labels   <- cell.type.to.colour$CellTypes[filter.key]
+cell.colours  <- cell.type.to.colour$Colours[filter.key]
+
+labels.cols <- mapvalues(x=labels, from=as.vector(cell.labels), to=as.vector(cell.colours))
+write.csv(data.frame(LabelCols = labels.cols), "./input/label_colours.csv")
+
+png("./input/legend.png", width = 1000, height = 800)
+legend.plt <- plot.indexed.legend(label.vector=cell.labels, color.vector=cell.colours, left.limit=3.6, text.size=10, ncols=2, symbol.size = 15)
+print(legend.plt)
+dev.off()
+
+print("End")
+
+
+
+
+
+
+
--- a/pipelines/14_fdg_animation_write_input/force_abstract_graph_2Danimation/prepare_input.sh
+++ b/pipelines/14_fdg_animation_write_input/force_abstract_graph_2Danimation/prepare_input.sh
@ -0,0 +1,11 @@
+#!/bin/bash
+
+#$ -cwd
+#$ -N prepare_input
+#$ -V
+#$ -l h_rt=23:59:59
+#$ -l h_vmem=400G
+
+Rscript prepare_input.R
+
+echo "End on `date`"
--- a/pipelines/14_fdg_animation_write_input/input/SNN.smm
+++ b/pipelines/14_fdg_animation_write_input/input/SNN.smm
--- a/pipelines/14_fdg_animation_write_input/input/label_colours.csv
+++ b/pipelines/14_fdg_animation_write_input/input/label_colours.csv
--- a/pipelines/14_fdg_animation_write_input/input/legend.png
+++ b/pipelines/14_fdg_animation_write_input/input/legend.png
--- a/pipelines/14_fdg_animation_write_input/input/pca.csv
+++ b/pipelines/14_fdg_animation_write_input/input/pca.csv
--- a/pipelines/14_fdg_animation_write_input/write_data.R
+++ b/pipelines/14_fdg_animation_write_input/write_data.R
@ -0,0 +1,112 @@
+# labels have been updated, should remove the part that overwrites cell labels
+# must create functions that handle the formation of FDG animation:
+# - data writer
+# - plotting that takes dimension parameters
+
+library(plyr)
+library(RColorBrewer)
+library(Seurat)
+
+seurat.addr    <- "../../data/test_yolk_sac_subset.RDS"
+seurat.obj <- readRDS(seurat.addr)
+cell.type.to.colour <- read.csv("../../resources/test_yolk_sac_fdg_colour_key.csv")
+
+print("Checking for doublets:")
+print(table(seurat.obj@meta.data$doublets))
+
+# a plotting function for indexed legend; special modifications for current script
+plot.indexed.legend <- function(label.vector, color.vector, ncols = 2, left.limit = 3.4, symbol.size = 8, text.size = 10){
+  if (length(label.vector) != length(color.vector)){
+    stop("number of labels is different from number of colors\nAdvice: learn to count!")
+  }
+  if (ncols > length(label.vector)){
+    stop("You cannot have more columns than labels\nSolution: Learn to count")
+  }
+  indices.vector <- 1:length(label.vector)
+  label.no <- length(label.vector)
+  nrows <- ceiling(label.no / ncols)
+  legend.frame <- data.frame(X = rep(0, label.no), Y = rep(0, label.no), CS = color.vector, Txt = label.vector)
+  for (i in 1:label.no){
+    col.index <- floor((i - 1) / nrows) + 1
+    row.index <- 15 - ((i - 1) %% nrows + 1)
+    legend.frame[i, 1] <- (col.index - 1) * 2
+    legend.frame[i, 2] <- row.index
+  }
+  plot.obj <- ggplot(data = legend.frame, aes(x = X, y = Y))
+  plot.obj <- plot.obj + geom_point(size = symbol.size, colour = color.vector)
+  plot.obj <- plot.obj + scale_x_continuous(limits = c(0, left.limit)) + theme_void()
+  plot.obj <- plot.obj + annotate("text", x=legend.frame$X+.1, y = legend.frame$Y, label=legend.frame$Txt, hjust = 0, size = text.size, colour = "white")
+  plot.obj <- plot.obj + theme(panel.background = element_rect(fill='black'))
+  return(plot.obj)
+}
+
+# a plotting function for indexed legend
+plot.indexed.legend <- function(label.vector, color.vector, ncols = 2, left.limit = 3.4, symbol.size = 8, text.size = 10, padH = 1, padV = 1, padRight = 0){
+  if (length(label.vector) != length(color.vector)){
+    stop("number of labels is different from number of colors\nAdvice: learn to count!")
+  }
+  if (ncols > length(label.vector)){
+    stop("You cannot have more columns than labels\nSolution: Learn to count")
+  }
+  indices.vector <- 1:length(label.vector)
+  label.no <- length(label.vector)
+  nrows <- ceiling(label.no / ncols)
+  legend.frame <- data.frame(X = rep(0, label.no), Y = rep(0, label.no), CS = color.vector, Txt = label.vector)
+  legend.frame$X <- rep(1:ncols, each=nrows)[1:nrow(legend.frame)]
+  legend.frame$Y <- rep(nrows:1, times = ncols)[1:nrow(legend.frame)]
+  Xrange <- range(legend.frame$X)
+  Yrange <- range(legend.frame$Y)
+  plot.obj <- ggplot(data = legend.frame, aes(x = X, y = Y))
+  plot.obj <- plot.obj + geom_point(size = symbol.size, colour = color.vector)
+  plot.obj <- plot.obj + scale_x_continuous(limits = c(Xrange[1] - padRight, Xrange[2] + padH))
+  plot.obj <- plot.obj + scale_y_continuous(limits = c(Yrange[1] - padV, Yrange[2] + padV))
+  plot.obj <- plot.obj + theme_void()
+  
+  plot.obj <- plot.obj + annotate("text", x=legend.frame$X, y = legend.frame$Y, label = indices.vector, size = text.size)
+  plot.obj <- plot.obj + annotate("text", x=legend.frame$X+.1, y = legend.frame$Y, label=legend.frame$Txt, hjust = 0, size = text.size, colour = "white")
+  plot.obj <- plot.obj + theme(panel.background = element_rect(fill='black'))
+  return(plot.obj)
+}
+
+pca.data <- seurat.obj@dr$pca@cell.embeddings
+write.csv(pca.data, "./input/pca.csv")
+
+seurat.obj <- BuildSNN(object=seurat.obj, reduction.type="pca", dims.use=1:20, plot.SNN=F,force.recalc=T)
+writeMM(obj=seurat.obj@snn, file="./input/SNN.smm")
+
+labels        <- as.vector(seurat.obj@meta.data$cell.labels)
+labels.unique <- unique(labels)
+
+print("printing cell.type.to.colour")
+print(cell.type.to.colour)
+
+print("!is.na(cell.type.to.colour)")
+print(!is.na(cell.type.to.colour))
+
+if(is.data.frame(cell.type.to.colour)){  # a colour key table was loaded
+  cell.labels  <- as.vector(cell.type.to.colour$CellTypes)
+  cell.colours <- as.vector(cell.type.to.colour$Colours)
+  filter.key <- cell.labels %in% labels.unique
+  cell.labels <- cell.labels[filter.key]
+  cell.colours <- cell.colours[filter.key]
+}else{
+  cell.labels <- labels.unique
+  set.seed(100)
+  cell.colours <- sample(colorRampPalette(brewer.pal(12, "Paired"))(length(labels.unique)))
+}
+
+print("printing cell.labels")
+print(cell.labels)
+print("printing cell.colours")
+print(cell.colours)
+
+labels.cols <- mapvalues(x=labels, from=cell.labels, to=cell.colours)
+write.csv(data.frame(LabelCols = labels.cols), "./input/label_colours.csv")
+
+png("./input/legend.png", width = 1000, height = 700)
+legend.plt <- plot.indexed.legend(label.vector=cell.labels, color.vector=cell.colours, ncols=2, left.limit=0, symbol.size=17, text.size=10, padH=.9, padV=.6)
+print(legend.plt)
+dev.off()
+
+print("ended beautifully")
+
--- a/pipelines/14_fdg_animation_write_input/write_data.sh
+++ b/pipelines/14_fdg_animation_write_input/write_data.sh
@ -0,0 +1,11 @@
+#!/bin/bash
+
+#$ -cwd
+#$ -N write_data
+#$ -V
+#$ -l h_rt=23:59:59
+#$ -l h_vmem=400G
+
+Rscript write_data.R
+
+echo "End on `date`"