Fixed NS issue. Alt xsl for notes without offsets. Added file handling steps.

Thomas Gideon 2010-10-16 11:34:10 -04:00
parent 4c5423c55d
commit d0269672ec
3 changed files with 24 additions and 7 deletions

README (3 lines changed)

@@ -5,5 +5,6 @@ encode.bash - Drives a set of encoders and tagging utilities to convert a single
 relink.py - A script intended for one time use to tweak a feed to re-link its enclosures to appropriate URLs at the Internet Archive.
 tidyrc - Tidy config that approximates the formatting of the live feeds to minimize disruption.
 publish.bash - Script to automate as much of the publishing step as possible.
-outline.xsl - Transform that handles the recursive structure of OmniOutliner files better than Beautiful Soup does.
+with_offset.xsl - Transform that handles the recursive structure of OmniOutliner files better than Beautiful Soup does, works with completed show notes that have time offsets.
+without_offset.xsl - Transform that handles the recursive structure of OmniOutliner files better than Beautiful Soup does, works with segment notes that do not have time offsets.
 outline.bash - Drives the XSLT operation and subsequent scripting tasks that cannot be handled in XSL.


@@ -28,12 +28,30 @@
 # ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 # SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-# TODO copy .gz file based on arg
-# TODO gunzip original file
-xalan -xsl outline.xsl -in contents.xml -text -out contents.txt
+src=$1/contents.xml
+shift
+target=$1
+# copy .gz file based on arg
+cp "$src" ./contents.xml.gz
+# gunzip original file
+gunzip contents.xml.gz
+# makes sure the ns is only on one line, also
+# makes the interim file more readable for debugging
+tidy -config tidyrc -xml -m contents.xml
+# xalan doesn't handle ns well, stripp them
+sed -e "s/ xmlns=\".*\"//g" -i contents.xml
+# TODO use grep to figure out which xsl to use
+xalan -xsl without_offset.xsl -in contents.xml -text -out contents.txt
+# expand the indent counts to proper leading white space
 sed -e "s/^2/ /" -i contents.txt
 sed -e "s/^3/ /" -i contents.txt
 sed -e "s/^4/ /" -i contents.txt
 sed -e "s/^5/ /" -i contents.txt
 sed -e "s/^6/ /" -i contents.txt
-less contents.txt
+# snug the result where requested
+mv contents.txt "$target"
+# clean up the temporary files
+rm contents.xml
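
Note on the namespace-stripping step: the pattern `xmlns=\".*\"` is greedy, so on a line where other attributes follow the namespace declaration it would remove everything up to the last double quote on that line. The sketch below shows one possible tighter pattern. The function name `strip_default_ns` and the sample element are hypothetical and not part of this commit.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the commit: removes a default namespace
# declaration from each line of XML read on stdin.
strip_default_ns() {
    # [^"]* stops at the first closing quote, so any attributes that follow
    # the xmlns declaration on the same line are left in place.
    sed -e 's/ xmlns="[^"]*"//g'
}

# Made-up sample element, for illustration only.
printf '%s\n' '<outline xmlns="http://example.com/ns" version="1.0">' | strip_default_ns
```

Reading from stdin keeps the sketch independent of `sed -i`, whose argument handling differs between GNU and BSD sed.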


@@ -6,8 +6,6 @@
 <xsl:call-template name="item">
 <xsl:with-param name="indent"
 select="1"/>
-<xsl:with-param name="prefix"
-select="*"/>
 </xsl:call-template>
 </xsl:for-each>
 </xsl:template>
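
Note on the `indent` parameter: judging from the comment in outline.bash ("expand the indent counts to proper leading white space"), the transform appears to emit the indent level as a leading digit, which the five `sed` calls then turn into whitespace. A rough sketch of how those calls could be folded into a single pass follows. The function name `expand_indent` is hypothetical, and the width of two spaces per level is an assumption, since the actual widths are not recoverable from this diff view.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the commit: replaces a leading depth
# digit (2 through 6) with indentation on each line read from stdin.
# Assumption: two spaces per level below the top level.
expand_indent() {
    local depth pad script=""
    for depth in 2 3 4 5 6; do
        # Build (depth - 1) * 2 spaces of padding.
        printf -v pad '%*s' $(( (depth - 1) * 2 )) ''
        script+="s/^${depth}/${pad}/;"
    done
    # One sed invocation applies all five substitutions.
    sed -e "$script"
}

# Made-up sample lines, for illustration only.
printf '%s\n' '2first child' '3grandchild' 'top level' | expand_indent
```

Lines without a leading digit pass through unchanged, matching the behavior of the separate `sed` calls.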