Application Example: Silktorrent Package Manager

Acronym: mmmv_pkgm_t1

The core idea

The Silktorrent packets are tar-files that are identified by their secure hash, not by their storage location. That allows the tar-files to be distributed from multiple storage locations, "mirrors", without giving any party the opportunity to change the files without causing the files to fail their integrity checks. The "mirrors" provide redundancy and part of the censorship countermeasures.
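The idea can be illustrated with a minimal sketch. It assumes SHA-256 as the secure hash and models each mirror as a plain function that returns the packet bytes; both are assumptions made for illustration, and the names sha256_hex and fetch_verified are hypothetical. The actual hash algorithm and packet name format are not specified in this section.

```python
import hashlib
from typing import Callable, Iterable, Optional


def sha256_hex(data: bytes) -> str:
    """Hex digest of the data (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(data).hexdigest()


def fetch_verified(expected_hex: str,
                   mirrors: Iterable[Callable[[], bytes]]) -> Optional[bytes]:
    """Try the mirrors in order and return the first copy whose hash matches.

    Each mirror is modelled as a zero-argument function returning bytes.
    A mirror that is offline (raises OSError) or that serves altered
    content is skipped.
    """
    for fetch in mirrors:
        try:
            data = fetch()
        except OSError:
            continue  # mirror unreachable, try the next one
        if sha256_hex(data) == expected_hex:
            return data  # integrity check passed
    return None  # no mirror delivered an intact copy
```

A mirror that tampers with a file can only cause a failed download from that mirror; it cannot get altered content accepted, because the expected hash comes from the packet's identity, not from the mirror.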


Theory

The assumption is that backwards compatibility of libraries does not exist, even if advertised. Whenever a software component dependency is declared by stating that the version of the dependency is "newest" or "greater than X", a flaw is introduced. That is the reason why Linux/BSD/etc. package collections are "unstable" and NEVER WILL BE "stable". For built/compiled software components, each combination of build parameters, compiler and environment set-up is a separate version of the software component.


Implementation Notes


API for the WWW and data Storage Devices

Silktorrent packets, the tar-files with the file extension of "stblob", are served by an ordinary web server the same way downloadable files, for example JPEG files, are generally served. The stblob-files are in folders that have arbitrary names (file system implementation limits, like the maximum name length, do apply), and in addition to the stblob-files those folders can also contain files that are not stblob-files. One of those optionally present files is a text file named find_stblob.txt. The format of find_stblob.txt is that of the output of the command:


    find . -name '*.stblob' > ./find_stblob.txt


except that find_stblob.txt is also allowed to contain blank lines and the classical comments that start with # and/or // . The parsing of find_stblob.txt is required to include a string trimming operation, where spaces and tabulation characters are removed from the start and end of each line. A path from a leaf of a file system directory tree to the root of the file system directory tree is allowed to contain N_find_stblob_txt instances of find_stblob.txt, where N_find_stblob_txt is in the closed range

    [0, <number of vertices on the path>]
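The parsing rules above can be sketched as follows. The function name parse_find_stblob is hypothetical, and the sketch assumes that comments occupy whole lines; whether a comment may follow a path on the same line is not stated here.

```python
from typing import List


def parse_find_stblob(text: str) -> List[str]:
    """Return the stblob paths listed in the content of a find_stblob.txt.

    Spaces and tabs are trimmed from both ends of every line; blank lines
    and lines that start with '#' or '//' are skipped.
    Assumption: comments are whole-line comments only.
    """
    paths: List[str] = []
    for raw_line in text.splitlines():
        line = raw_line.strip(" \t")
        if not line:
            continue  # blank line
        if line.startswith("#") or line.startswith("//"):
            continue  # comment line
        paths.append(line)
    return paths
```

The returned paths are only candidates: a listed stblob-file may be absent by the time the client requests it, so the client still has to handle failed downloads.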

The find_stblob.txt does not need to be static; for example, its content may change multiple times per second. The stblob-files that are listed in find_stblob.txt are NOT guaranteed to be present, and stblob-files are allowed to go offline in the midst of their download sessions. In the case of data storage devices like USB sticks and DVDs, the web server and internet connection parts are omitted and the client is meant to read find_stblob.txt and the stblob-files directly from the data storage device, "the disk".



An Optional Storage Allocation Policy

To make the Silktorrent network of package/packet hosting servers more reliable, the package/packet hosting servers should use multiple storage allocation policies simultaneously. There is one allocation agent per policy. Each allocation agent has its own, fixed, amount of storage space, disk space, which will be allocated to Silktorrent packets according to the policy that the agent implements. Some agent, bot, may sell paid subscriptions, as Dropbox and similar services do. Some agent might run a mirroring service in favor of some public library or operating system package repository. Some agent might store Silktorrent packets according to popularity. Some agent may offer a personal storage service to the owner of the server. Some agent may serve some Silktorrent based messaging service. Some agent might serve a Silktorrent based "web" (PDF-files, LibreOffice files, all-in-one-HTML-documents, etc.).

To avoid duplicate copies of the same Silktorrent packet at the same storage server, the agents of a single storage server may use a single "storage engine" that keeps track of the storage space quotas of the agents and physically deletes a Silktorrent packet only if no agent on this storage server wants to store that Silktorrent packet.
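A minimal in-memory sketch of such a shared storage engine is given below. The class name StorageEngine and its methods are hypothetical. The sketch assumes that every agent that wants a packet is charged the full packet size against its own quota, even though the bytes are stored only once; other accounting rules are equally possible.

```python
from typing import Dict, Set


class StorageEngine:
    """Tracks which agents want which packets; one physical copy per packet."""

    def __init__(self, quotas: Dict[str, int]) -> None:
        self.quotas = dict(quotas)                  # agent -> bytes allowed
        self.used = {agent: 0 for agent in quotas}  # agent -> bytes charged
        self.sizes: Dict[str, int] = {}             # packet -> size in bytes
        self.holders: Dict[str, Set[str]] = {}      # packet -> interested agents

    def keep(self, agent: str, packet: str, size: int) -> bool:
        """Register the agent's interest; False if its quota would be exceeded."""
        holders = self.holders.get(packet, set())
        if agent in holders:
            return True
        if self.used[agent] + size > self.quotas[agent]:
            return False
        self.holders[packet] = holders | {agent}
        self.sizes[packet] = size
        self.used[agent] += size
        return True

    def release(self, agent: str, packet: str) -> bool:
        """Drop the agent's interest; True if the packet was physically deleted."""
        holders = self.holders.get(packet, set())
        if agent not in holders:
            return False
        holders.discard(agent)
        self.used[agent] -= self.sizes[packet]
        if holders:
            return False  # another agent still wants the packet
        del self.holders[packet]
        del self.sizes[packet]  # stands in for deleting the file on disk
        return True
```

In a real server the dictionaries would be backed by a database and the deletion step would remove the stblob-file from disk; the sketch shows only the bookkeeping.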



Storage size Requirements

The hashes within the Silktorrent packet names depend on the tar-files, the Silktorrent packets. The hash of a tar-file depends on the file attributes, including the date-and-time attributes. That is why it is not possible to recreate a Silktorrent packet from the unmodified content of an unpacked Silktorrent packet without some special hacking, meaning: copies of downloaded Silktorrent packets must not be deleted if they are going to be passed on somewhere. To use the downloaded Silktorrent packets without un-tar-ing them and waiting for the slow HDD operation to finish each time, an unmodified set of un-tar-ed Silktorrent packets must also be stored. Given the huge number of Silktorrent packets, the packets must be distributed among a set of folders, because file systems become slow when a single folder holds too many files. The nested folder names may be derived from the first letters of the Silktorrent packet names. The Silktorrent packet names were intentionally designed to have evenly distributed letters at their start, because that allows database indices that index the Silktorrent packet names to probabilistically work faster.
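Deriving the folder path from the first letters of a packet name might look like the sketch below. The function name shard_path, the parameters depth and width, and the packet name used in the usage lines are hypothetical; the real Silktorrent packet name format is not described in this section.

```python
from pathlib import PurePosixPath


def shard_path(packet_name: str, depth: int = 2, width: int = 1) -> PurePosixPath:
    """Place a packet under nested folders named after its leading letters.

    depth is the number of folder levels, width the number of letters
    used per level. Both defaults are arbitrary choices of this sketch.
    """
    needed = depth * width
    if len(packet_name) < needed:
        raise ValueError("packet name is too short for the requested sharding")
    parts = [packet_name[i:i + width] for i in range(0, needed, width)]
    return PurePosixPath(*parts, packet_name)


# Usage with a made-up packet name:
print(shard_path("ab12cd.stblob"))           # a/b/ab12cd.stblob
print(shard_path("ab12cd.stblob", 2, 2))     # ab/12/ab12cd.stblob
```

Because the leading letters are evenly distributed, the packets spread roughly evenly over the folders, so no single folder grows much faster than the others.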

Silktorrent packets, tar-files, that contain software/datacollection forks or newer versions of the software/datacollection, might consist of many bitstreams that match the original version of the software/datacollection Silktorrent packets. The storage space needed for a collection of tar-files that contain relatively long, common, bitstreams can be reduced by storing the common, relatively long, bitstreams only once. The search for the common bitstream tokens can be greatly simplified by explicitly telling which Silktorrent packets, tar-files, are version-wise or fork-wise closely related. The arising knapsack problems might be solved by using specialized open source libraries. Compression algorithm development and the search for common bitstream tokens might get some inspiration from genetics software, which might even be re-used in some cases. It is OK for the compression software to run as a not-that-well-optimized background task, as long as the decompression is really fast.
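As a much simpler stand-in for the common-bitstream search described above, the sketch below stores fixed-size blocks only once, keyed by their hash. It is not the knapsack-based or genetics-inspired approach; it only illustrates the "store common parts once, decompress fast" idea. The block size of 512 bytes is chosen because uncompressed tar archives are built from 512-byte records, so an unchanged member file tends to produce identical blocks in related archives. The class name BlockStore is hypothetical.

```python
import hashlib
from typing import Dict, List

BLOCK = 512  # tar archives consist of 512-byte records


class BlockStore:
    """Stores every distinct 512-byte block once; files become lists of keys."""

    def __init__(self) -> None:
        self.blocks: Dict[str, bytes] = {}  # block hash -> block content

    def put(self, data: bytes) -> List[str]:
        """Store the data and return its recipe, the ordered list of block keys."""
        recipe: List[str] = []
        for offset in range(0, len(data), BLOCK):
            chunk = data[offset:offset + BLOCK]
            key = hashlib.sha256(chunk).hexdigest()
            self.blocks.setdefault(key, chunk)  # keep only the first copy
            recipe.append(key)
        return recipe

    def get(self, recipe: List[str]) -> bytes:
        """Rebuild the original bytes; this is a plain concatenation, so it is fast."""
        return b"".join(self.blocks[key] for key in recipe)
```

Reconstruction returns the exact original bytes, which matters here because the packet hash depends on every byte of the tar-file.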



Example use Cases

A Silktorrent packet is a tar-file that contains folders 

payload

header

In software projects and HTML pages the content of the folders payload and header can be referenced by using the tree encoding (archival copy):

<tar-file name>/payload/<the file or folder relative to the folder payload>

<tar-file name>/header/<the file or folder relative to the folder header>

The <tar-file name> can be the name of a local folder or some folder at some publicly hosted web page. The include/require/src/uses statements of various programming languages and configuration files, including HTML, can be modified by changing the prefix of the <tar-file name>. That allows an HTML-page to be switched from using JavaScript libraries from one site to using the very same JavaScript libraries from another site.

http://www.first_site.com/<tar-file name>/payload/the_JavaScript_library.js

http://www.second_site.com/<tar-file name>/payload/the_JavaScript_library.js

Because the prefix can be changed with regular expressions, the proposed solution does not require code generation. It might be usable with many existing IDEs without requiring any additional tools or plugins. The solution is also programming language agnostic.
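The prefix switch can be sketched with a single regular expression. The function name switch_site and the packet name example_packet are hypothetical, and the sketch assumes that the old prefix consists only of a scheme and a host, as in the two URLs above.

```python
import re


def switch_site(text: str, tar_name: str, new_prefix: str) -> str:
    """Replace the site prefix in front of <tar_name>/payload/ or <tar_name>/header/.

    Assumption: the old prefix is 'http://host/' or 'https://host/' with
    no further path components before the tar-file name.
    """
    pattern = re.compile(
        r"https?://[^/\s]+/(?=" + re.escape(tar_name) + r"/(?:payload|header)/)"
    )
    replacement = new_prefix.rstrip("/") + "/"
    # A function is used as the replacement so that backslashes in
    # new_prefix are never interpreted as regex escapes.
    return pattern.sub(lambda match: replacement, text)


html = ('<script src="http://www.first_site.com/example_packet'
        '/payload/the_JavaScript_library.js"></script>')
print(switch_site(html, "example_packet", "http://www.second_site.com"))
```

The same substitution works for any text file that references the packet, for example when new_prefix is a local folder such as ./vendor, which is why no language-specific tooling is needed.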




Partial list of Similar Software